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Second, a family of continuous random coefficient models with 
£-Laplace distributions are examined. The %-Lapiace distribution is 
described along with a useful eranerornat ion The correlation structure 
for special cases is derived. For a special case when %£ is one, the 
BELAR(1) model with Laplace marginals, the maximum likelihood estimator 
of serial correlation is derived. Least squares estimates are also 
derived using the concept of a linear residual. These estimators of 
correlation, along with other estimators of location and scale are 
compared in a small simulation study. 

Thirdly, the NLAR(1) and the BELAR(1) processes are compared using 
higher order residual analyses based on the uncorrelated, but dependent 
linear residuals, (RJ. 
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I. INTRODUCTION 


In standard time series analysis, one assumes.the marginal 
distributions of (X ) ape Normal, i.e. Gaussian. However, a Gaussian 
distribution will not always be appropriate. In earlier works by Gaver 
and Lewis [Ref. 1]; Jacobs and Lewis [Refs. 2,3]; and Lawrance and Lewis 
[Refs. 4,5,6], stationary non-Gaussian time series models were developed 
for variables with positive and highly skewed marginal distributions. 

There still remain other situations for which Gaussian marginals are 
inappropriate, i.e. the marginal time series variable being modelled, 
although not skewed or inherently positive valued, has a large kurtosis 
or long-tailed distribution. The position errors in a large navigation 
system have such a distribution. In particular, Hsu [Ref. 7] modelled 
pooled position errors using the double exponential distribution (also 
called the Laplace distribution). Also MeGill [Ref. 8] showed that the 
Laplace distribution provides a characterization o BO error n E 
timing device under periodic excitation. Speech-waves are modelled 
using Laplace variables (Davenport [Ref. 9]). In the "speech- like" 
process given by the linear AR(1) model 


% = OX gee D (1.1) 


where .8 <š c § .9, the innovation sequence (E? is i.i.d. Laplace (Linde 


and Gray [Ref. 10]). In image coding systems using a two-dimensional 
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discrete cosine, DC, transform, Reininger and Gibson [Ref. 11] showed 
that the Laplace distribution gives the best approximation to the 
distribution of the non-DC coefficients. Recently Sethia and Anderson 
[Ref. 12] required a stationary autoregressive process with Laplace 
marginals in their research in communications technology. 

Even before Gaver and Lewis [Ref. 1] wrote the pioneering paper on 
the subject of autoregressive processes with a specified non-Normal 
marginal distribution, Gastwirth and Wolff [Ref.13] had derived a 


solution to the linear additive first-order difference equation 


X = pX +E, O) 


for which (X, ] is marginally Laplace. This result was used later by 
Gastwirth and Rubin [Ref.1!] within the context of robust estimation on 
dependent data. This solution to (I.2) is here called the Laplace 
First-order Autoregressive Process (LAR(1)). The early solution of 
(I.2) is mentioned at this point, merely to further substantiate the 
claim that non-Normal, heavy-tailed distributions are of interest. 

In this thesis, several time series models with a specified 
symmetric, heavy-tailed marginal distribution are presented. This 
distribution, called the &-Laplace distribution, includes the Laplace 
distribution as a special case. The approach in Chapter II extends the 
discrete random coefficient model of Lawrance and Lewis [Ref. 6], New 
Exponential Autoregressive Moving Average--NEARMA(p,q), to the case 


Wmnerestne marginal distribution iss lmaplace, also ealled deb le 
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Exponential. This class of models is calleq ire pi o 
Autoregressive Moving Average model, NLARMA(p,q). several special cases 
of NLARMA(p,q) are individually researched. The second-order 
autoregressive model, NLAR(2), is established by showing the conditions 
for existence and uniqueness and by specifying the innovation structure. 
The correlation structure of NLAR(2) is also given along with results 
concerning directional moments and partial time reversibility. 

For the case when p = 1 and q = 0, called NLAR(1), the distribution 


of the difference X X. - is derived, providing some insight into the 


1 
nature of the differenced NLAR(1) model. The conditional density of X. 


given X, is also derived, which leads to a brief investigation of the 


= 
likelihood function. Parameter estimation in NLAR(1), however, is 
limited to comparisons of the moment estimators and the least squares 
estimators for the independent model parameters of serial correlation. 

The correlation structure is derived for other models in tue 
NLARMA(p,q) family: the first-order moving average called NLMA(1); the 
first-order mixed model called NLARMA(1,1); and the special cases of 
DEED autoregressive models called TLAR(p) that are analogous to the 
TEAR(p) model of Lawrance and Lewis [Ref. 6]. These models demonstrate 
the flexibility of the NLARMA(p,q) family. 

In Chapter III, a family of stationary time series is developed 
using continuous random coefficients in the additive difference equation 


model. The marginal distribution is specified to be a member of the so- 


called £-Laplace distributions, the properties of which are described at 


2 


the beginning of the chapter. The "square-root Beta-Laplace" transforn 
is defined. It is used to formulate the £-Laplace time series models. 

For the special case when $ - 1, the marginal distribution is again 
Laplace. The autoregressive model is called the Beta-Laplace First- 
Order Autoregressive model, BELAR(1). The conditional density of X. 
given e is derived. This leads to the derivation of a likelihood 
function and a numerical technique to evaluate and maximize the 
likelihood function with respect to the model parameter for serial 
correlation. 

Several facets of the parameter estimation problem are investigated 
for BELAR(1). The behavior of different estimators of scale and 
location are compared using the Simulation Testbed (SIMTBED) of Lewis, 
Orav and Uribe [Ref. 15]. The least squares estimation theory is 
derived around the concept of a linearized residual. Asymptotic 
properties are derived using results from Nicholls and Quinn (Ref. 16]. 
Robust estimators are defined and simulated in SIMTBED. Finally, a 
numerical scheme for finding the maximum likelihood estimator of serial 
correlation is used in a small simulation study of the small sample 
properties of the maximum likelihood estimator. 

In the last section of Chapter III, a first-order moving average 
model is discussed. A aie onder moving average model in &-Laplace 
variables is also derived. 

The random coefficient approaches are not the only ways to generate 
Laplace or other variables with a specified correlation structure. The 


literature contains numerous articles on generation of random sequences. 
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One approach put forth in several papers (Gujar and Kavanagh [Ref. 17]; 
Haddad and Valisalo [Ref. 18]; Li and Hammond [Ref. 19]; Liu and Munson 
[Ref. 20]; Sondhi [Ref. 21]) involves passing white Gaussian noise 
through a linear filter followed by a zero memory nonlinear transform. 
This is a general procedure that produces exactly the required marginal 
distribution and a good approximation to the autocorrelation structure. 
However, the scheme lacks the simplicity of either of the methods being 
proposed. Moreover, the filtering approach produces, for example, in 
the first-order autoregressive case, oul Ton pss ss 

It is important to note that in non-Normal time series, there are 
infinitely many processes with a given marginal and autocorrelation 
Structure. This is the case, for example, in the two-parameter NLAR(1) 
process. The differences in these processes must be explored through 
higher joint moments. In Chapter IV, residual analyses using fourth 
joint moments are derived. The ideas are modifications of those from 
Lawrance and Lewis [Refs. 6, 22], who accomplished an analysis using 
joint third moments within the NEAR framework. The residual analysis is 
applied to show the differences in the various NLAR(1) processes and the 
BELAR(1) process. 

In Chapter V, open problems and possible extensions of the analyses 
given in this thesis are discussed. Possible applications to the 


analysis of wind velocity data are detailed. 
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II. DISCRETE RANDOM COEFFICIENT MODELS WITH LAPLACE MARGINALS 


A. INTRODUCTION 

Two aspects of modelling with dependent random variables are widely 
uj ced--the marginal distribution and the correlation structure. It is 
widely known how to generate sequences with either a specified marginal 
distribution or a particular correlation structure. Transforming the 
random variables may have an undesirable and unknown effect on tne 
correlation structure. Likewise, the marginal distribution of a 
filtered process may be unknown. 

It is the generation of random variables with both a specified 
marginal and a specified correlation structure that is discussed in this 
chapter. Specifically, we want sequences with a Laplaee (double 
Exponential) marginal distribution and with ARMA(p,q) correlation 
struetures as given by Box and Jenkins [Ref. 23] for the usual linear 
ARMA(p,q) models. 

The following is an example of a process that has Laplace marginals. 
Let (X ) be a binary Markov chain with transition matrix P, so that 


NDIPO 01 = a PIX =l|X SUM NOS 


es 
Let Un = (-1) 


> A O 


eso -1] 51-0 E, where (E ) is an i.i.d. 


2* 


Exponential sequence. If A, =a,=0, (L. ) has a Laplace marginal 


distribution. However, the correlation structure is not that of an 


Mor ocess. It is, in fact, easy to see that Corr (L> ba-g’ = 
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(1/2) (22-1) 41 , for k=+1,+2,..., which is not a pure geometric function 
Ofek. 

Two processes which produce an AR(1) correlation structure and a 
Laplace marginal distribution are the Laplace Discrete AR(1), LDAR(C 1), 
which is an adaption of the DAR(1) process of Jacobs and Lewis [Ref. 2], 
and the linear process of Gastwirth and Wolff [Ref. 13], called the 
LAR(1) process. The LDAR(1) model produces an (Xx) sequence using the 


first-order autoregressive equation with random coefficients 


E O AD (ICAN) 
where am is an i.i.d. sequence of Bernoulli random variables with 
PU = Veg) aaa: (LJ is an i.i.d. sequence of Laplace random 
variables. The coefficient and innovation processes from time n are 


assumed to be independent of X X .... This sequence produces runs 


-1* "n-2? 
of constant value when successive realizations for Va produces the value 
1. When Ya is zero, a new value is selected. Although LDAR(1) is of 
limited value in general application because of this runs property, it 
is significant in that it is one of the first in a series or mork 
general discrete random coefficient equation models for non-Normal time 
series, and it produces a first-order autoregressive Markovian process 
for any specified marginebedist DUC 

The LAR(1) model turns out to be a special case of the more general 


process called the New Laplace Autoregressive Moving Average model, 


NLARMA(p,q). Properties of the LAR(1) process are pointed out in the 
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mext section of this chapter, which gives a characterization of the 
Laplace distribution. 

The NLARMA(p,q) model is a very useful family of time series models 
that are discrete random coefficient linear difference equations. The 
models are extensions of the NEARMA(p,q) structure of Lawrance and Lewis 
[Refs. 4,5,6] to those cases where the underlying marginal distribution 
is Laplace rather than Exponential. The family provides great 
flexibility to systems modelling, because of the broad range of 
correlations and different dependency structures which are obtainable. 

Section C is an examination of the second-order autoregressive model 
of the family, NLAR(2), for p = 2 and q = O in NLARMA(p,q). Conditions 
for the existence and uniqueness of the strictly stationary NLAR(2) 
model are derived using results from Nicholls and Quinn [Ref. 16] about 
Random Coefficient Autoregressive models of order k, RCA(k). Ina 
proof, very similar to that given by Lawrance and Lewis for the NEAR(2) 
model [Ref. 6], the innovation for the NLAR(2) model is derived 
explicitly. The innovation is shown to be a convex combination of 
scaled Laplace variables. The correlation structure in the NLAR(2) 
model is shown to satisfy the Yule-Walker type equations just as do the 
linear AR(2) models. Aspects of directionality and time reversibility 
are also addressed. 

In Section D, the first-order autoregressive model, NLAR(1), is 
described. It is a two-parameter, first-order Markov process which is a 
special case of the NLAR(2) model. The distribution of differences is 


derived. The conditional density of X, given X. -1 and the likelihood 


2T 


funetion are also derived. The non-differentiability of the likelihood 
function for all values of the two parameters has pee envcas - 
development of the maximum likelihood estimators. Parameter estimation 
is discussed within the context of moment estimators and least squares, 
using the usual linearized residual. 

In Section E, several different special cases of NLARMA(p,q) are 
formulated and briefly discussed. The correlation structure for a 
first-order moving average model, NLMA( 1), and a mixed autoregressive 
moving average model, NLARMA(1,1) are given. Correlation structure is 
derived and parameter estimation is discussed for the general p^ order 
autoregressive models, TLAR(p), which are special cases of the NLAR(p). 

Each of these models in Section E could well be the basis for 
further research. The intent at this point is primarily to further 
substantiate the claim of wide versatility and tractability in modelling 
non-Normal time series within the context of the NLARMA(p,q) family. 

For example, the bivariate AR(1) process with Exponential marginal 
distributions of Dewald and Lewis [Ref. 24], can be extended to the case 
where the marginal distribution is Laplace. This, however, is not 
discussed further in this thesis. 

B. CHARACTERIZATION OF THE LAPLACE DISTRIBUTION 
1. Properties of the Laplace Distribution 
The Laplace distribution is also known as the double Exponential 
distribution" In general, the density of a Laplace distributed 


variable, L, has two parameters--a location parameter -o < y < +”, and a 


28 


Scale parameter À > 0. The parameter y is fixed here at zero. For 


-œ < x < æ we have 


1 
£553) = op exp(- |x]/3). (ITOBe1:1) 
In what follows we will define (L. ) as a sequence of i.i.d. 
random variables of the Laplace distribution with A = 1 (Standard 


Laplace). The characteristic function of the standard Laplace variable 


is 


SO (AB. 2) 


and we have 


A 0 Ee ee ebe 
E (8.1.3) 
n! if n is even, 


so that E(L) = 0, Var(L) = 2, skewness is zero, and kurtosis is 3. The 
value of the kurtosis indicates that the symmetric Laplace distribution 
has heavier tails than the normal distribution, for which the kurtosis 
ise 0). 

The sum of n 2 2 i.i.d. standard Laplace variables can be 
written as the difference of two i.i.d. random variables Y. ç Y, with 


Gamma distribution, shape parameter k = n and scale parameter À = 1. 
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This follows immediately from the characteristic fume rer: Let 


n 
SE y L,; then 


i=1 


“. UT DEM 
by Cw) S — ü rt A = oy s MTB. ley 
This result is quickly generalized. Replacing n by t > O, we see that 
(o (ail is the characteristic function for the variable X = Y, = Y. 


where Y, ~ Gammalt, D in n orana H and Y. are independent. This 
demonstrates that the Laplace distribution is infinitely divisible. 

Another useful result is obtained from (II.B.1.4) when n » 1. 
It shows that a Laplace variable is the difference of two i.i.d. 
exponential (A = 1) variables. This makes it quite simple to generate 
Laplace distributed variates in computer simulations. 


Random variables with a standard Laplace distribution are self- 


decomposable. Let 
$ (o) = o, (w)/>, (pw), OSEE CII.B. 195) 


According to Feller (Ref. 25: “p. 509]. if $ Cw) is the transform of a 
random variable for each 0 £ p < 1, then L is said to be self- 
decomposable. But for -w <X y < o 


r a ou 


b Cu) 


Ir (II.B.1.6) 


fo + (1-7 p)CU E ID CRUENTO DUE 
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POEM (83 (1 + Na". (II.B. 1.7) 


We recognize (II.B.1.6) as the product of the characteristic functions 
Siwtwo 1.1.d. innovation variables, €. and 765) as described in the 


EAR(1) process in (Ref. 1]. Also from (II.B.1.7) 


€ = (T1: B. 1789 


2. The Laplace First-Order Autoregressive Process, LAR(1) 
The i.i.d. sequence teu] with distribution given in (EE.B.1.8) 
is the innovation process of a first-order linear autoregressive 


equation 


X = pX te, KUUWA 


where {xX} is a stationary time series with double exponential marginal 
mi iribution, Boc. This is the LAR(1) model. It is actually a 
rediscovery in light of the fact that Gastwirth and Wolff [Ref. 13] had 
derived it earlier; also, Gastwirth and Rubin [Ref. 14] discuss it 
within the context of robust estimation techniques. The present account 
of LAR(1) includes new results. 

The LAR(1) model has the same properties as the EAR(1) model in 
ER with two important differences. First, if -1 < p < 0, negat me 


serial correlations for odd lags are obtained. Secondly it. is 


3] 


partially time reversible in the sense that for all g% and n both of tie 


following are true: 


: = 2 = e ° 
E(XX, ig? E(X, X. 19) OF Ciel Bir) 
> = < = 
P(X z X 1) EE S I nd 1⁄2. (TIB. 2.3) 


These results are derived in Section II.C and Section IrI.D. Note; 
however, that since LAR(1) is a linear AR(1) model with non-Gaussian 
innovation fe}, it is not fully time reversible (Weiss [Ref. 26]). 
Also, note that this LAR(1) model has the zero defect property; when 


c xD then X /X 
n omn 


a = p and p can be determined exactly in long enough 


zu 
runs of the series (XJ. This property is generally undesirable, but 
the broader NLAR(2) model developed in the next section is free of this 
defect, except for the special parameter values for which it reduces to 
the LAR(1) model. 

If no repeats are observed in a realization of the time series, 
an extremely efficient estimator of p for LAR(1) is the median of the 


ratio Xi/ Xi. The simulation resulto Teim Inn babe ioe OR 


1 ° 
substantiate this claim. In Section II.D.H and again in bb bae using 
the framework of the Simulation Testbed (SIMTBED) [Ref. 15], we will see 
that this median ratio is for small! samples very biased, and tan 


apparently, asymptotically biased in all of the random coefficient AR(1) 


models with a Laplace marginal distribution that we examine. 


32 


TABLE IL B 2.1 
Simulation Results using Median (Xi /X,. i) to Estimate 


p in the LAR(1) Process for Samples of Size 2000 


True p 6 = med (XX 4) Comments 

-.9 -.9 -.9 occurred 1586 times 
in 1999 ratios 

2 -.2 -.2 occurred 75 times 
in 1999 ratios 

ES] -.08746 -.1 occurred 11 times 
in 1999 ratios 

*.01 *.01986 *.01 never occurred in 
1999 ratios 

5 Eb *.5 occurred 490 times 
in 1999 ratios 

EN +, 75 t. (5 occurred 189 


times in 1999 ratios 


C. A SECOND ORDER AUTOREGRESSIVE LAPLACE TIME SERIES MODEL, NLAR(2) 
Il tncroduetrion 
Using the terminology from [Ref. 6] the following time series 
model called NLAR(2), New Laplace Second-order Autoregressive model is 
proposed. This is a special case of NLARMA(p,q) model with p = 2, 
q = 0. The NLAR(2) model has four parameters, double exponential 
Malena” distribution for (X ), second-order autoregressive Markov 


dependence, and autocorrelations satisfying Yule-Walker type equations. 


ES 


The stationary NLAR(2) model has the same form as the stationary 
NEAR(2) model in [Ref. 6]. Writing the time series (Xj in the form or 


an additive, linear, random coefficient autoregressive geeen rc 


equation, we have for all n that 


= t 
X B, Ka X 


" 
S * B Ka X -2 t E? (EDO PI 


= =2 


where (K^, KE is a sequence of i.i.d. discrete bivariate random 


variables with distribution 


(1,0) VP. Ou: 
däit E = (O 1) VD. Ga n = 0-71, ri 


(050) w. p m A, TA» E 


le) is an i.i.d. innovation sequence whose distribution is given in 
(11.C.2.4); and [e ) and (Kr, K^] are mutually independent and 


,X por... The parameter space is defined by 


independent of X. 


=1 n-2 


S. 1.andu). Sos ES As S 1. Graphs of the 


0 $ [8] i 1 


admissible regions in the parameter space and the correlation space are 
presented in Section II.C.3. 

Equations (II.C.1.1) and (II,G l 2J have a dirc Phon 
interpretation. The observed process at time n, Xn? 1s only one of 
three possibilities: i) X is some multiple of what it was at time n-1, 
8 X,» Plus some random noise E ino X. is some multiple (possibly 


1 


different than B.), of its value at time n-2, BA o plus some 
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independent random noise; iii) X is just random noise, € independent 
of everything up to time n. 
2. Existence and Uniqueness 
The work of Nicholls and Quinn (Ref. 16] on random coefficient 
autoregressive models is relevant to the NLAR(2) process. They have 
given the necessary and sufficient conditions for the existence of tne 
unique covariance stationary solution to the following class of 


univariate random coefficient autoregressive models of order p, RCA(p): 
p 
Z = L e DIE cse Ge (IEC ICO 


AER + + 2, ..., where 
acne Yi'S are real constants; 
b. (B ) is a p-vector, second-order stationary, independent process 
with E(B) = 0 and constant covariance matrix; 
C. te] is a scalar, second-order stationary, independent process, 
independent of (B ), with Elie EE pn. 
They also have shown that if (B ) and le) are i.i.d. processes, 
then the solution (Z ) is strictly stationary and ergodic. 
Let Y, = 0,8, Pop S012 3n B (1) = BIR. E a, ) and B (2) = 
PS = a5). Then (II.C.1.1) and (II.C.2.1) have the same form. That 
is (II.C.1.1) is an RCA(2) model if the innovation of NLAR(2) satisfies 
condition (iii) above. Thus applying the results in [Ref. 16: p.31 and 


p.37], there exists a unique strictly stationary and ergodic solution to 


anm 2 4) for Y and B (i) as defined above, if and only if all of the 


SE 


roots of the characteristic equation 
2 SE 2 20 2 d 
ER a, Bt aB5) (t a8 5) O HT C 2D 


are within the unit circle mm .c n 2,84 + a8 5 < 1. Thisis satishied 
for the conditions on the parameters defining NLAR(2), thus establishing 
the existence of the model (II.C.1.1). 

No marginal distribution is ascribed to solutions of the general 
RCA(p) models in (Ref. 16]. It is, in fact, determined by the 
independent choices of the innovation and the random coefficients. 
However, by specifying the marginal distribution and the random 
coefficients, in NLAR(2) the innovation is restricted more than in the 
RCA(p) model. If the X. in (II.C.1.1) or Z, in (II.C.2.1) have a 
standard Laplace marginal distribution, then all their moments are given 


by (11.B.1.3). "From "MEC. 1. 1) or (TI G 2 EEGENEN 


OS y 2 mana 
2k 2k 2k 
Bes = (er [1 - (aa =a E 
ER f | f (Guise 2272) 
] SES, 2(k-i) 2(k-i) | 
L | (a, 8: 0.85 )E(e anon 5 son 


PEE SEIS EIAS (Cos a) 
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Since a, and a, are probabilities it is necessary that |8,| $ 1 for 


1 


NEENME? roP-(II.C.2.3) to hold. If'not, there exfStS" Por every a, > 0 


and WA 20 an integer m, such that a, B?" or aster is greater than 1. 
We have now established the necessary conditions on the 
innovation le.) and on 8, and 8,--namely that 18, | $ 7,1 =1,2--for 
the existence of a unique strictly stationary solution to (II.C.2.1) 
with a marginal Laplace distribution and with the random coefficients 
given by (II.C.1.2). In the next theorem, we show that |8.| $ ! for 
NEST is also a sufficient condition and that such an innovation 
random variable € exists. We also give its explicit form--a convex 
combination of Laplace random variables. For simplicity, the parameter 
space is regarded as being described by strict inequalities for 
THEOREM icc bet (Xx) be a stationary process with standard Laplace 
pau distribution. For all n, let equations (11.C.1.1) arma 


(11.C.1.2) hold with 0 << |B | < 1, 0 < a, < 1 for i = 1,2 and 


a, * A, < 1. Then 


L. W.D. 12P27P3> 
= = D: sos 
E K b. EBI w.p Po. ue ) 
EI WINES Pas 


where (Ll are i.i.d. standard Laplace variates; the SES have values in 
ie Ios]. |b|} and are independent of (L ) and (KA, Ki for all n. 


They are also independent of X ni^ g-2*777* Furthermore, 
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Po = ( (a, 8? i a 85)b5 Eu a) BF B.) /(b; = b3) (1 = b), Git. C255) 
P3 = ((a, + 0,) MP (a, B. + a,82)b21 /(b3 - b3) (1 - bi, (MTC. 200) 
1 > bå = bis + (s? - 401/27 > 03 = Hs - (8% - tr) 7) » o, (11.6.2.7) 
SE e a,)85, and CLL, C. 2) 
r-(1-2, - 08185. CI C29 


Proof: 
For the NLAR(2) model specified by (II.C.1.1), (II.C.1.2) and 
(II.C.9.4) (LO lo by Cw) and $ Cw) be the characteristic 


functions of tHe {xX} and te} sequences. If (X ) is stationary, then 
by Cw) = $ (o) a, e C8, 0) d aby (Bow) + (170, 7a) J. (TRE O 


Assuming a standard Laplace marginal distribution for (X d, the 
independent distribution of fe] has a characteristic function, possibly 


improper, given by 


(1*8207) (1*8507) 


E rola, a, 82825* * (1740 87. * (i-a BEju* * 1T. 


(ILC 
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tris convenient to Pactor the quadratic Mn d? in he 
denominator of (II.C.2.11). The roots of this factor are both real and 


distinct. To see this, note that 


= 2 es = X. n 
((1 a.) Bi : (1 a) BS}? dé d 05) 8185 


- R LI 202 2952 
Te a) B; (1 a.) 82] * la, a4 8$ 85 > 0% 


The roots are also both negative, which can be seen by noting that the 


= bat 2 2 ° e E 
product ES AL Q 2,05) 81 85 is positive, but the sum r, + r; 


= -((172,) 82 s (174) 82) / (170, 70) 8185 is negative. 


S 21/03» we can rewrite (11.C.2.11) 


using partial fraction decomposition to obtain 


Letting ca -1/05 and r 


1 1 1 
$. (o) E cose eae) * Pa 15222) $ ENET CMR Ca 2a 12) 


Econ ( II. C .2.11) 


2 SES = 2 - < = ° . a 
b; + 93 (1 a4) 84 HADI a.) 85 S (CA. 13) 
and 
22 = (1-a,- Bocas 12072. B 
Dap, = (1-a,-a,)8785 = n Circ ) 
Ponparinewchl.C.2.le2) ane electa 10) term tor Cermiralso yields 
_h2 d QM 2 2 IT 6.2. 1055 
p, (1-b5) + SÉ b3) a,8°, + 0,85 ( 5 
and 


29 


p,(1-05)03 + p4(1752)53 = (a, *a) 8: 85. (iC. eee 


Expressions for b5, PES D. and p, are obtained in terms of a 


and 85: by solvinE (II.C.2.13) = (ill PoS S 


Tet 
B 


Ot] 


(II.C.2.15) and (II.C.2.16) simultaneously uo o nD s EID 
(11.C.2.6). Equations for b> and bš given in (II.C.,2.T)ü5Sre obtained 
from solving (II.C.2.13) and (II.C.2.14) simultaneously. Arbitrarily, 
let 55 be the larger value. 

It remains now to show that the inversion of (II.C.2.12) will, 
in fact, yield a function that is a probability density and is the 
mixture of densities for scaled Laplace variables. To do this, we show 


3 
< 1, we have, after adding (II.C.2.5) 


that Po and Pz can be interpreted as probabilities and that Do * Do SM 


To establish that p. + E 


andi gik O26) 


2 2 E 212 
; _ (a, 87*0,85) (a, +a,) 8,85) nc 
EE AS E? 


Multiplying out (1753) (1752) and using (IIL.C.2 13) and (ITC 2h RE 


have, after some rearrangement 


(17-81) (1782) 


Dot p Kasi Tops DY Ano C RU ERE S MEE E E (II.C.2 09 
2 3 (1-87)(1-85) + a8/(1-85) + a,85(1-87) 


This expression is clearly positive and less than one, from 


which it follows that pətp.<1. 
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To show that P> and P3 are probabilities, it remains to snow 
Pace they are both positive. To do this, it iS Snown that the 
muwerators and the denomínators of (11.C.2.5) and (11.0 .2.6) are 


positive. For the denominators, this requires that 0<b? , b? <1, which 


3 
is shown by noting 0€ (1753) (17b2)€1. From CII.C.2-.0 7) and QMEC.2.198), 


it follows that 


(1-05) (1-b3) (1787) (1-82) * e (i e) + a 851787) OL 


Also, 


E Ba, 2 n2) e = 
1 (1 b5) (1 D3) SE D3) bb 


= 2 = 2 ne m " 252 
( 1 a, ) 81 Em | a) BS (1 a. a.) 8185 


- 2 n2 - 2 = a 2 2n2 
(1-0,)87(1-87) + (1-0,)83(1-83) > 8283 > 0. 


Therefore, Di and bi are Less than one, so Po and P 3 have 


2 3 
positive denominators. 
To see that P5 and P3 have positive numerators, note that it 
must be true that 
2 n2 
(a, +a.) By BA 


Do AAA B2 D CIONES 
3 (a, 81 *a,82) 2 


4] 


Using (II.C.2.8) and (II.C.2.9), (ITO -N iS equ NH 


° < 2b - s < ea 


eom 
on 
s? - > (s-2b)2, 


or 


sb - b* - r> 0. (MEE 2020) 


But the lefthanda side or (IIS 


2 n2 LINA 
2 2I NE2 , 
( a, 8. ta, 82) 


which is strictly positive. 

Therefore, Po and P3 are both positive and PENE Therefore, 
Pos P3 and D can be regarded as probabilities. Therefore En has a 
proper density which can be generated as the mixture of three Laplaces 


with scale parameters 1, LA and |b_|, respectively. QE DA 


" 

The general NLAR(2) model uses the four parameters to achieve a 
wide range of sample path behavior. Figure II.C.2.1 illustrates four 
different realizations of the NLAR(2) process. In each cease, the 
theoretical autocorrelations are identical with p(1) = 0.64 and 
p(2) = 0.5. Also, note that each sample path was generated from the 
same i.i.d. standard Laplace sequence (Li, such that (X. ,X.) E (L.L). 


), 


since this is not the steady state bivariate distribution of (X SX 


the sample paths illustrated in Figure II.C.2.1 are displayed beginning 
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witn X 501 to avoid the initial transient behavior of the process. The 
true value of each parameter is displayed above the corresponding sample 
path. Figure 11.C.2.2 contains the Scavvermpvers =! OF Seale calpain 
Figure II.C.2.1. The sample size in each plot is 2000. 

Many special cases of the NLAR(2) model could be mentioned. The 
following have one or more of the parameters at their boundary value and 
have valid but less complicated results for the distribution of le) in 
MA AO If a, = a, 


coe € If a, = 1 then fe} is the innovation of the LAR(1) model 


2 0 then fe} is the i.i.d. sequence (1) and 


derived from (II.-B8M- 7) and Crus WEL 18, | = LM - | and 
woo < 1] then eaen En is distributed as a scaled Laplace random 
variable, VIE. Loe These models are called the TLAR(2) models, 
which are easily extendable to higher-order autoregressions, as 
discussed in Section II.E. If a, < 1 anda, = 0 or dë = 0, then le) is 
the innovation of the new first-order autoregressive model NLAR(1). 
This model is the subject of Section II pb. 
3. Autocorrelation Structure 
In this section, it is shown that the autocorrelations 


poe Corr (X, ), $ 20,z1,z2,... of the NLAR(2) model satisfy the 


í An-2 
Yule-Walker type difference equations; thus the second moment dependency 
aspects are indistinguishable in form from those for the AR(2) process. 
We also compare the admissible regions of an AR(2) with (i) an NLAR(2) 
with 4 parameters and (ii) an NLAR(2) with only two parameters. 

From the independence of (K ] and {K) S and (IT. G 31 E 


(II.C.1.2) and (11,02. 1er cena. E(K2) = ges ECK?) = 4, and 


1 2 
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ECe, ) = E(K )E(L ) = 0. Multipliying (II.C.l 20) on Toch Sne SRo X -2 we 


Have for UE ee E 


A et M x Gol ol 


n-1 n-& a T 


Dividing by Var (X, ) we have pl-2) = aB PCR - 1) + aB PCA - 2), since 


o(-2) = p(2). Substituting o. B. - a, for i = 1,2 and p(0) = 1, we have 


li 


pC1) a, + a Jp C1) 


1 


p( 2) 


a p(1) * 85, AC) 


which are the same equations as those which occur for the AR(2) process. 

Since |8,| $ ! for i - 1,2 anda, + a, $ 1 in NLAR(2), the usual 
triangular admissible region for AR(2) given in Box and Jenkins 
[Ref. 23: p. 61] shrinks to the interior of a diamond-shaped area in 
(a, = ab; aj = 0582) coordinates: adea EU 
II.C.3.1a and 1b)... In (p(1), 62) } coordinates themen 
p(1)? = (1 + p(2))/2 defining allowable combinations of p(1) and p(2) in 
AR(2) also changes. For NLAR(2), the space in (p(1), p(2)) coordinates 
becomes a triangular region bounded below by | (1) | = SU + p(2)}. (See 
Figures II.C.3.2a and 2b). 

The reduction in allowable parameter or correlation combinations 
for NLAR(2) over the AR(2) model is not large. This encouraged us to 
consider a 2-parameter NLAR(2) model by specifying a, = Bio for i = tyes 
SO that a, = d The parameter space in (84,8) coordinates becomes the 


Se (see Figure 


symmetric region bounded by the curves 85 = + (1 - 87) 
Dp C. EM IN T B) coordinates the admissible region of the two 


parameter model is bounded by the unit circle 83 * 85 = 1. Using only 
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POINT PLOTS OF ADMISSIBLE REGION FOR p(1) AND p(2) 
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pO) 


NLAR(2) MODEL WITH 4 PARAMETERS 
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SO 
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o1 


p(1) 


for pi) and p (2) for 


ion 
Linear AR(2) and NLAR(2) Processes 


ble Reg 


1551 


Point Plots of Adm 


Figure, TI. C 322; 


two parameters leads to the admissible region in Figure II.C.3.2c for 
(p(1), p(2)) space. The (p(1), p(2)) space was obtained by transforming 
the lines 85 7-35 0,-18$0 5$ 1, in Figure II-C.3.]c to p(2) s» 
(1-a,)p(1)*+a,, where |p(1)| s aa ys 84 / (1-82) and 85 - e 
if a, 2 0 and d = E if a, < `0. 

Al the pilots in Fig. I1.C.3.1 were generated Prom 3 grid of 
equally spaced values of a, and a». In Fig. 11.C.3.1a The po las 
satisfy the Yule-Walker equations (5.1). In Figs. II.C.3.1b and ic, the 
Noms also satisfy the conditions of Theorem i. In Fig. I1.C.3.2 the 
feasible combinations of p(1) and p(2) are plotted for those values of 
a, and a, from Fig. 11.C.3.1 using the Yule-Walker equations (5.1). 

4. Directional Moments and Partial Time Reversibility 

In the last section, we demonstrated that the second moment 
dependency aspects of the NLAR(2) model were indistinguishable in form 
from those of the ordinary AR(2) model. Also, it is well known that if 
the linear autoregressive model is not Gaussian, then the process is not 
completely determined by the first and second moments. Thus in model 
identification it becomes necessary to examine third order moments to 


), 


further identify the process. Special third order moments E E 


for all £, are known as directional moments. If the directional moments 
for all £ are equal, which is necessary for a process to be fully time 
reversible, we say the process is partially time reversible in the sense 
of directional moments. 

A process is fully time reversible (Lawrance [Ref. 27]) if the 


uecmrEdistribution of X ° X ne GE , is the same as that for x : 


pe n+r ap 


X ee for all r and for all n. Since LAR(1), a special case of 


n+r-1? 
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NLAR(2), is not fully time reversible, NLAR(2) is in general not time 
reversible. 

In this section we show by induction arguments that all the 
third order moments of NLAR(2) are the same as those for Gaussian AR(2) 
model; i.e. E(X XX) - 0for i, j, k. This implies particularly that 
the directional moments of NLAR(2) are equal and therefore that NLAR(2) 
is always partially time reversible. 

In Section II.B, we found that E(X7) = 0 for all i since X, is 


marginally Standard Laplace. It is easy to establish the following two 


equations: 
2 E 2 f 
2 E 2 a 2 
BOOK 4) * [(820,)/(1 -28,8,0,0,)) E(X_X% _,). (TL. Ces) 


Solving (II.C. 4.) and E 422). simultaneously eds E(X X 4) - 


2 SS 
E EE 


Now, using separate induction arguments and the stationarity 


assumption, we establish that E(X Xi- Ja- for -alll J Gana 


2 
: = r 2 
E(X X = ) O to all k > T. 


The proof of E(X X^ ) = 0 is straightforward. 


L 


To prove E(X X. y? = 0, we first show that the expectation of 


special third order moments of the form X XL LAS for Kk 2.2 15 zeros 


Def i - : - PRA 
efine u, E(X X _,X 212? and assume EE opm es 1 From 


CT ies. 
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E EX z E 
Hr n O aB EX Xn- (k-1)? S aB ECX X 


SE 


_ k-1 
= ABU, x eins (a>B.) hy > (EEC UE) 
ew trom (11.C.4.1) and (11.C.4.2), we have 


= 2 E 2 > = 
H. AA a JBAE(X?X, 4) 0. Therefore D, On 


We now proceed to show that ECX XX) = O) Fom all i, j; Kk. 


Gëft loss of generality let i < j < k so that k "i *n, j^ i*m 


amd n » m. Therefore by stationarity E(X.X.X,) = E(X.X. X. _) = 
IN K 1 lem i*n 


E(X,X, 
i 


X D Fix m so that 0 < m < n we use induction on n. 
wam) in 


Let n = 2, implying m = 1. The first step in the induction follows from 


E(X X, _ X ) = = 0. Wext assume that for m <n § K, 


a 


E(XTX, 


X ) = 0. Now we show that E(X.X. ) = O. 
i i-(n-m) i-n i 


ión aa 
Using (II.C.1.1), we write 


WA eo 8 0185 EU am arsi 
poh Ee etiam) *1- Geen) 
"Ee Xa e en pacer) D 
E CARRY) + EA oo Ope (p 1)) 9-9 and 
EQG mA (+1)? - EO Xi (ag) X4 - 0 by stationarity and 


) 


the assumption. Likewise cr a AA) 


E(G X; ) = 0. This completes the induction. 


Bc emn NO) 


An immediate result from the argument about third moments is 


that Z = X - X for (X ] of the NLAR(2) is not skewed. 
n n nel n 


al 


The residual analysis in [Ref. 6] and [Ref. 22] using cross 


correlations between linear autoregressive residuals R. = X = aX iy 


X and their squares R^, does not shed any new light on the 


SRI 
directionality/reversibility in the NLAR(2) model or help in identifying 
the appropriateness of the Laplacian model. This is because all third 


) 


moments have zero expectation. Thus, we see that B CRUS ee 


E(R R^ |) = 0 for al) Yi 

Note that the basis for the residual analysis using the (RJ 
process is that this process is uncorrelated but not necessarily 
independent. The moment results show that the RAS have zero skewness. 
In fact, it is easy to show that the distribution of R is the same as 
the distribution of E» Thus the R 'S are symmetric although they 
will, of course, not have Laplacian distributions. 

In Chapter IV of this thesis, a residual analysis based on 
certain fourth-order moments is presented. 
D. THE NEW LAPLACE FIRST-ORDER AUTOREGRESSIVE MODEL, NLAR(1) 

i. intreductien 

The new Laplace first-order autoregressive model is another 

special case of the NLARMA(p,q) model when q=0 and p=1. This is, of 


course, a special case of the NLAR(2) model, where either a, and/or 8 


2 


are zero in (II.C.1.1). Examples of the different sample path behavior 
obtainable from the NLAR(1) Process are given in Figure II.D.1.1. Note 
that each sample has the same value of lag-1 serial correlation, i.e. 


p(1) = Corr (X ,X. ). In Figure II.D.1.2 are the corresponding scatter 


e 
plots for the samples in Figure II.D.1.1. In the seatter plot labeled, 


"a51", the distinctive regression line, Xp “ex, is clearly visible 
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for the LAR(1) process. This is produced as explained in Section II.B, 
because the innovation, e,» can be zero with non-zero probability. 


The two-parameter autoregressive model generates an (X ) 


sequence which satisfies 


= t 
MS * Es (CUAD 10 
= 1 
KA i Mas 

where 

1 W.p. a, 1 W.p. 12p, 
Kt z ; Ko Z CI] D51202) 

O  w.p. 1-a, d 17a, |B,|L, w.p. Ps 
and 

- 2 = E 2 


Also, (Kit, (K), (Lu are i.i.d. sequences independent of each other 


and independent of X1? Kadi nene 

From (II.D.1.2) and (II.D.1.3), we see that the inversion of the 

; = |Z e 
characteristic function for e, letting A = (1-a,) * (8, ) , gives 
for O<a,<1 
O o EE 
i Ge) = 5 ° |x| ae e Ax]. (LISBe.5) 
n 


which is a convex mixture of Laplace densities. This result also 


follows directly from Section II.C.3, since the NLAR(1) model is an 


5.5 


NLAR(2) model. Likewise, the correlation structure and partial time 
reversibility in the sense of directional moments are the corresponding 


results for the NLAR(2) model with a.=0 or 8570. That is 


2 


Corr (X ,X - = (ag) IKI Por all k =" 0," 1 $520 pP EN KEN 


k) 
and 


2 = 2 = = a eVete 
E(X Xek’ = E DEPO a kx 0, +1, +2 GIT EE 


We can rewrite (II.D.1.1) as 


n a 
" J 
X =£ + L B4 om ese (Tip h 


From (II.D.1.1)9 9m" PsC cbear that X, depends only on X, , and e,. From 


(II.D.1.7), we see tnat X is independent of E, for all k20. Hence 


-1 +k 


is a first-order Markov process and starting Ko with a standard 


Laplace distribution makes (X, ] stationary. 

The remainder of this section is devoted to specific results for 
the NLAR(1) process which have not been shown in the more general 
NLAR(2) model. The extension of these results to the NLAR(2) process 


would require Tie) cine, di seae pul TONO (XX 4X ols which has not 


been derived. mihe C ondit ron deus aoe X, giyen om is derived als 


1 


well as an expression for the joint distribution of the X ° The 


ELSE EECHER dure Zu? is also derived: EEN 


estimation is discussed in the context of moment estimators and least 


squares using the linearized residual. The problems with finding the 


maximum likelihood estimators of d and d are also addressed. 


2. Conditional Density and the Joint Density of (Xo pee eX) 


To find the conditional density of Xa? given Be we use 


1 LU 


). We have for «.«1, 


(II.D.1.1) ^ (II.D.1.8) to evaluate P(X <x |X 


nel 


which eliminates the LAR(1) process, 


P(X, €x, |X Sa 


= t 
n=1? PCR BX ne] ` Epi ) 


u 


a,P(e «x, 7B.x, T + (17a, )P(e €x) 


on 


Seo x x 
Dens] n 
= a, J DE COSS sc | f(x)dx. (O 
n n 


ErueDentiating (II.D.2.1) with respect to X. yields the following 


expression fon datt 
p. Ix. (x, [x, 2) E Mc soe ). Ce Ore es) 
Ma n=l n n 
Examples of (II.D.2.2) for a fixed X=, and fixed Y = a,B, = .64 are 


given in Figure II.D.2.1. 


Now we can write the joint density f (X, X AD as the 


X X . n 
nens! 
product fy | opas Oe (ail: In fact, the n-dimensional 
no css mne 
dgeScPIDbution of NS is obtained using this product recursively to 
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obtain the density 


Leg 1 me lee s erer 
n 1 nne pelin 
Ss |x pos (ADO) 
2] 1 
3. Distribution of Differences and PX) 


Ncenow consideri tene distribution Of the dirfrerence Za = X X nit 
uus II.D.1.1) * CII.D.T1.4) and the fact that En is a convex mixture 
of Laplacian random variables, we used partial fraction decomposition to 


invert the characteristic function of Zu to obtain the following 


expression for the density: 





a4 C178.) Po (17 p5) 
eni us | 


t; n = exp(- |y |/ (178,2) ]—; VITE UOCE ETE TTE 


a (17a,) 


1 
E exp(<|y|/o) (op,/2) E = (1-8 )2 " 1^6? 
1 





(17a, )p, (17a, ) (17 p5) =: 


+ dexp(=|y]) aot = , ————— - —— 
2 1^9 2 8, (278,) 


* (17p5) C72) |y |expCo |y 75, (Ci Deen 


where g? = (17a, B1. 


29 


One immediate result is that f (y) is symmetric about zero and 


n 


therefore, P(Z <0) = P(Z >0) = 1/2. This demonstrates one additional 
feature of the partial time reversibility of the NLAR(1) models; i.e., 
probabilities of a run down WA and a run up A are the 
same. To evaluate probabilities of higher order runs would require the 
joint distribution of the sequence (Z ). This result has not been 
obtained for the NLAR(2) model. 
4, Estimation of Serial Correlation 
a. Introduction 
The purpose of this section is to present estimators of the 


two parameters a, and B whose product is the correlation coefficient in 


1 


the NLAR(1) models. We assume throughout this section, unless otherwise 
stated, that (X, ] has a standard Laplace (p=0, A=1) marginal 
distribution. Estimation ofru anda ue models that have marginal 
Laplace distributions are discussed in Chapter III. We also only 
consider the random coefficient models of the NLAR(1) process, i.e. a<1, 
thus eliminating the LAR(1) model. As was shown in the introduction to 


this chapter, for a,=1, 8, can be estimated very efficiently, thus 


1 1 


eliminating the need for further discussion. 
The method of moments is used first to find an estimator of 


The joint moment estimators of a, and 8, are calculated from 


i 1 1 


fourth<order moments. These estimators are used later in an iterative 


procedure to obtain the joint least squares estimators of a, and By. 


1 


A least squares estimation procedure is defined for the 


NLAR(1) models using the usual linear residual H. = A,0,8 X 
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Minimizing the sum of RA leads to the usual estimator of Y as given in 
standard texts on time series. In order to estimate a. and B 
individually, we minimize the square of a particular function of R with 


respect to a, and Zu 


1 
In the last part of this section, the problems of maximum 
likelihood estimation in the NLAR(1) process are discussed. Although no 
results are presented for the general model, the maximum likelihood 
estimator of the correlation coefficient in the TLAR(1) model is given. 
b. Method of Moments 

(1) Estimation of Y by Second<Order Moments. Since X, is 

assumed to have a standard Laplace distribution with ECX, ) - 0 and 


Var(X? = 2, an immediately obvious choice for estimating 


Y = Corr(X ,X 24)? is the following product moment: 


, n 
ze DAI 
Y = (CTT ORA N) 
Taking the expectation of Y and using (II.D.1.1), we have 
A | n | n 
e O TE L e E s > A 


so that the estimator is unbiased. 
(2) oint Estimation of a, and B by Fourth-Order Moments. 
The expectation of fourth-order moments can be calculated using 


(II.D.1.1) and the fact that (x ] is a stationary process. For example 
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ENCE Oa = Cas, (254,383) > EEN 
E(X2X? |.) = 4(1+50,87) ; (II.D.4.4) 
E(X X?) = 24a,8, ; (il. Dade) 
BEE Ho 8. (1*20, 81 *3a, (27a, ) B1). ID 


Solving foroa, and B, in different pairs of these 


1 
equations gives the estimators based on fourth<order moments. It is to 
our advantage to use the expressions with the lower order moments where 
possible. Therefore, using E(X Xia) and HE E instead of 


EX X 4 ?> we solve for the following expressions for the joint moment 


estimators of a, and B, 








n 
B i: 
A i. M 
a, = - ; (TISS 7) 
(p x x SW (ec? 
122 
n 
) (X1X; 4) "w n T 
e Dm 
B, A S P T (IL: Doo 
10 ob KX, 


From the scatter plot analyses in Figures II.D.4.1 and 
II.D.4.2, we see an example of the behavior of this pair of estimators 
when Oe B. = ,8 in the NLAR(1) model. Both scatter plots contain 500 


pairs (a4, 8.) derived first from samples of size 250 and then from 
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samples of size 2500. It is clear from the equations (II.D.4.7) and 


^ ^ 


EnmD. 4.8) that Y - a,8,. The hyperbola can be seen in both scatter 
plots. Both parts are visible for sample size of 250. However, for 
pairs derived from samples of size 2500, only the part in the first 
quadrant is visible. 

From the Normal probability plots in Figure II.D.4.3, 
there is little evidence of non<Normality for Y = aB, Form e 250, and 
less for the estimator derived from samples of size 2500. However, 


^ ^ 


individual estimators A, and B, look far less Normal for both sets of 


sample sizes. 
c. Least Squares Estimation in the NLAR(1) Process 
(1) The Linear Residual. The properties of the linear 
residual are developed for use in deriving the least squares estimators 


of Y = a, . and for a, and d jointly. We begin by rewriting (II.D.1.1) 


1 
in the RCA(1) form as given in (II.C.2.1). We nave 


)X te. (IADS 


= UE 
X a, B X d B, (Ke 4 )X Li 2 


n TE 1 


From this expression, there are clearly two ways to write down the 


linear residual, R ° The usual one from linear theory is, of counse 
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However, a particularly useful way of looking at it is from 


= t= 
R B (KE a 


" E Fev, (II.D.4.11) 


RIS mn 


It is from (II.D.4.11) that we see explicitly how the i.i.d. innovation, 
e), and the coefficient (Kr 7a) processes impact on the linear 
residual. 


Let A. be the o<algebra generated by LECK 70.) €, 5; 


1 K 


EIER —m-11]. Int udet i ved ys Au. , represents all the information 


about the process up to time n<1. Conditioning on £ we have the 


n-i ? 


following two useful properties of R, as noted by Nicholls and Quinn 


Mes. 16: p. 42]. 


pm d 


ECR | Ana ) = 0. (Aa) 


2 2 t 2 
E(R?| 8iVar(K*)x*? , * Var(e,) CHIA) 


> 
4 
— 
lI 


il 


< 2. 2 Le, 2 
a, (1 B Xr + 2(1 a B7) 


1 (lab ala qh) 


These results follow because e is a function only of 


1 


the process through (n<1) and (Kata) and E, are both independent of it. 


(2) The Least Squares Estimator of Y = a,8 Using R. from 


¡AE 


(11.D.4.10) and a given sample from UL, we obtain the least squares 
n 

estimator by minimizing the sum ) R? with respect to the product aß, 
122 


which is now called Y. We have 
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2S 
< 


L i=] 
; CII.D. 495) 


—< > 
lI 
= 
N 


PS 
F^ Ñ 
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H Gil Ei 3 


P. 
Po 


which, in fact, is the usual expression forothevcestimatilon ofasenbst 
correlation in linear AR(1) models as given, for example, in Chatfield 
(Ref. 31: p. 66]. 

Since the NLAR(1) process is an RCA(1) prooess of 
Nicholls and Quinn, it follows from their theorem [Ref. 16: Pp. 44] that 


3 1 


Y is strongly consistent, asymptotically unbiased and (Y=Y) has an 
y N 





asymptotic Normal distribution. The asymptotic variance, from the same 


results of Nicholls and Quinn, is 


da = 1 + 50,87 = 6(a,8,)*. (MAD. 16) 
a 171 


Figures 11.D.4.4-11.D.4.7 contain the boxplot analysis 


of SIMTBED [Ref. 15] output for selected choices of a, and B, in the 


1 


simulation of the least squares estimator of the product aß, in the 
NLAR(1) processes. Note that although the estimated asymptotic mean is 


the true value, Y = a,8, = .64, for each of the four sets of the 


parameters, the estimated asymptotic variance of the estimator of 


B = Y is different for each of the four different sets of 


ES 


1 
parameters. The simulation results reflect the asymptotic theoretical 


results for the NLAR(1) processes as given above. 
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An analogous result is given in Section III.E.4, where 
the theory of least squares is derived for the Beta*Laplace AR(1) model. 


(3) The Joint Least Squares Estimation for a, and B, NC 


] 
n 

is not possible to minimize ) R? with respect to a 
ER 


ually. However, a technique from Nicholls and Quinn (Ref. 16: p. 43], 


1 and B individ- 


which uses the result in (II.D.4.13) is applicable. As was pointed out 
earlier in Section II.C.2, by assuming nothing about the particular 
marginal distribution, Nicholls and Quinn were free to treat the 


variances, d and oi as completely independent parameters subject only 


Ke 


to the constraint that the marginal distribution of (XJ, whatever it 


is, has a positive variance. Then, given (II.D.4.13), it was possible 


n 
to estimate dë and On by minimizing the sum of squares ) Ek where 
ER, 


^ 


S == R2 = g? ^ g2.X? 


D sls 
n n € Xo cepe ee WA 


ie 


and YA = (X47 YX, e and s from TAD. 5). -They derive the 


properties of the trivariate distribution of the estimator of 


Gye 92, PE 


2 


Since o: andá 
€ K 


are related parametrically in a, and 


B the results in [Ref. 16] concerning the variances do not apply in 


1 $ 
the NLAR(1) process. However, we can form from (II.D.4.13) and 


(II.D.4.10) an analogous expression for 


e 


T 2 = 2 ES 2 = a 2 
S. = R^ o4 84 (1 a4 )X* SN % 81), (11.0.4 T 


where the product a4 B. in R. is not replaced by Y from (II.D.4.15). 


In terms of a sample from po we define the joint 


and d to be those values a 


least squares estimators of a and d that 


1 1 


minimize 


n 
L ((x, ^a, B, x 


2 «= = 27772 
SEH aa 


= 2(1=a,B8,)}*, (Tipe d= ton 


ISi 1 


i 
where (II.D.4.19) is the sum of the squares of S, given in (MIDA SA 
Now it is clear that (II.D.4.19) is a highly nonlinear expression in two 


unknowns, a, and By: A given numerical technique could converge to a 


1 
local extremum, a saddle point, or diverge depending on, among other 


things, the starting values for estimating a, and ZE 


1 

Constraining the nonlinear optimization problem given 

by (II.D.4.19) to the rectangle within which the NLAR(1) process is 

defined--0 S d S lands s B, Ss 1<<eliminates the divergence problem, 

but clouds the estimation issue regarding the boundary models LAR(1) and 
TLAR(1). We try an unconstrained approach described below. 

(4) An Uneonstrained Nonlinear Optimization of (I1.D.4.19). 

It is easy, but tedious, to write the normal equations from (11.D.4.19). 

One critical point is at a, = gd - 0. After factoring a. from the one 

equation and d from the second, several iterations of the Newton- 


Raphson method (see, for example, Gerald [Ref. 28: pp. 122=128]) can be 


performed to find other critical points. The Newton-Raphson method uses 


74 


a second<order Taylor series approximation to solve the non-linear 
system by a set of linear Jacobian equations. However, one needs to 
Calculate the four second partial derivatives from (II.D.4.19) and to 
have a good starting point on the surface. 

The IMSL routine ZSPOW solves systems of non-linear 


equations for one root using modified Newton methods. This routine was 


used to solve the unconstrained problem of finding a, and B. from sets 


1 


of data from simulated NLAR(1) processes. The routine was very senstive 
to starting values and did not always converge even when the sample size 
was as large as 2500. It also did not perform well when the true 
correlation coefficient, Y = d was small for any of the simulated 
NLAR(1) processes with the same autocorrelation function, yl*l. This 
problem is highlighted by the fact that (II.D.H.19) is constant along 


the line a, = 0 and the line B. = 0. 


As an illustration of the performance of the routine, 
500 sets of sample sizes 250 and 2500, respectively, were generated from 


the NLAR(1) process with a, = d = ,8. The scatter plot analyses in 


A 


Figures 11.D.4.8 and I1.D.4.9 show how the estimators a, and B. 


1 


determined by ZSPOW are related. Especially for the samples of size 
250, there is the same pattern of the hyperbola as seen in the moment 


estimators of a, and d EE e: EE EE tee br 2, From the 
accompanying tables, it is clear that the variance of the marginal 


^ 


distributions for each estimator a, and B, is decreasing with increased 


sample size. The Normal plots of the empirical marginal cumulative 


dmsturibution functions for d andi for B, appear very non*Normal even 
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from estimators derived from samples of 2500. On the other hand, the 
Normal plots of Y = On indicate that the distribution is converging to 
a Normal distribution as required by theoretical results of the 
previous subsection. (See Figure II.D.4.10). 

It is convenient, at this point, to summarize the 


8, and 


results on the moment and least squares estimation of Y = a , 


1 
(a,,8,) in the NLAR(1) processes. 

In the estimation of Y, only second=order product 
moments are required for both methods. From the Normal probability 
plots in Figures II.D.4.3 and II.D.4.10, it appears that both estimators 
of Y are converging to Normal distributions. Although the moment 
estimator of Y is unbiased (the least squares estimator is 
asymptotically unbiased), the variance of the moment estimator of Y is 
considerably larger than that of the least squares estimator of Y. 

The estimation of d and B requires fourth<order 
product moments for both methods. The variance of the moment estimators 
of a, and d are too large, even for samples of size 2500 to be useful 
in distinguishing between NLAR(1) prooesses. The least squares 
estimators of d and B have smaller variances than the corresponding 
moment estimators and could be useful in distinguishing between NLAR(!1) 
processes. However, as pointed out above, the numerical routine to find 


the critical points does not always converge for a given starting value 


of a, and Bi: The conclusion is that neither method of estimating a 


1 1 


and gd is very satisfactory. 
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(5) The Median (X; /X; 4) Estimator of Y - a,8,. The median 
of (X; /X; 1) was seen to be extremely efficient in the LAR(1) process. 
It also makes sense in the context of maximum likelihood estimation in 
LAR(1). This is discussed in the next section. 

Simulation results confirm the conjecture that the 
median (X; /X; L4) is not a robust estimator of Y for departures from the 
LAR(1) process. In fact, from the bOxploUvs in BEES [le 1 | 
11.D.4.14 of SIMTBED output for four NLAR(1) processes, the estimators 
seem to become more biased as B. approaches one--corresponding to the 
other boundary process, TLAR(1). Even for the small size of the 
simulations, the standard deviation of the mean is small. For the three 
non-LAR(1) models, the asymptotic estimates of the mean of Y given in 
the data are each SNO oia from the theoretical value of 
Y = .61. 

d. Method of Maximum Likelinood 

(1) Introduction @eeiwe logarithm of. ti emaemded i 

function, Lla,,8,), is obtained by taking the natural logarithm of the 


n-dimensional joint density given in (II.D.2.4) and treating it as a 


function of o. and B. for a given realization of length n from (X ). We 
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have 


L(a,,B,) = -n(%n2) - |x. | * 


H Ceci 


ana, (1-p,)exp{-|x,-8,x,_,|} 


i=2 


+ (1-a,) (1-p,)exp{-|x, |} + o pjAexp(-A|x,-8,x, . | } 


1 
* (17a) p4Aexp( -A|x, [1], (II.D.5.20) 


1 
where p, was given in (11.D.13) and A = / (17a, ) 8$ : 


Maximizing (II.D.4.20) in the general NLAR(1) model is 
not accomplished here for two reasons. Hirst, L(o 8.) is Bot 


differentiable with respect to 8, at any of the n values gë, = x,/X, , 


for i = 1,...,n, because of the terms |x,-8,x A bivariate search 


iab 


routine that does not use derivatives is needed. 


Second, L(a, , 8.) is not defined along the line a, = 1 


muro Or 0 £ $n values of B. EIERE < d = IT, w 


see this, examine the third term of the natural loganithm in 


(II. D.1.20). We have replacing A for all i - 2,...,n 


p — | 
Las exp EE Ç (i ps nol) 


AER NIE MIETEN 


Q 


Because of the presence of the exponential term in (II.D.H.21), the 


limit as a, approaches one is zero, so long as B, * X. Xi ou The limit 


ies. 2 ee 


does not exist on the set B = BEE à X. /X, 5 
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It is worth noting tnat krog a, = 1, corresponding to 


the LAR(1) model, and exeept on the set B, (II.D.4.20), can be written 


as 
LAS Ix,] + (n-D£nC1- 82) 
n 
- P PSOE. Cox (Tipo o, 
122 
Now in(1-84) is maximized at d - 0 and the optimal value for 
n 


) |x,-8,x,.,| is the least absolute deviation (LAD) estimator of 8, 
1-22 


which is the weighted median of (x, /X; 4) where the weights are Xi: ron 


1 = 2,...,n. Thus, if after a large number of observations from (X ) no 


Ipo Ir X, / X, _ are observed, tHemwthece wi OS 


1 


between a particular LAR(1) model and the completely random model of 


l.i.d. Laplace variables. In this case, for any gd in a small deleted 


neighborhood around 8, = med(x./x. ,), (II.D.4.22) will be large because 
1 i= 


1 
n 


both %n(1-87) and ) |x,-8 | will be optimized. 
i=] 


1*1-1 

(2) The Maximum Likelihood Estimator of a, in the TLAR(1) 
Processes. In this section, the likelihood function for the TLAR(1) 
process is desoribed. The maximum likelihood estimator is found using a 
numerical iteration scheme. The properties of the estimator are 


investigated and compared to the least squares estimator using 


simulation. 
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For the TLAR(1) models (8, = 1 or d e -1), (II.D.4.20) 


can be written as a one-dimensional function of the a variable a. We 











have 
n SISI 
Paani) -= Ix, | + ) Sn exp | | 
ER vi=0, viva, 
+ Vi-a, exp} | š KSE e) 
dc 
where 
X, T Xi] ica (oir 
V. = (11.D.4.24) 
i 
O a < 0, 
L ISi 
-1<a<1 and a, = [al 


Now L(a) is continuous everywhere in the open interval 


(-1,+1) and differentiable everywhere except at a = 0. The expressions 


dL (a) TE ae Gay) 


ren da da?” 


are lengthy and cumbersome to use; hence are not 
given here. 

Examples of the likelihood eurve are given in Figures 
11.D.4.15 - 11.D.4.18. Each curve was generated from a sample of 100 


from a simulated TLAR(1) process with the stated a, and B. It is easy 


1 
to see the non-differentiable point at zero and how flat the curve is. 


To see that there is a maximum (annotated with x on the figure) in these 
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curves, the second part of each figure focuses on the function near the 
true value of E 

The IMSL routine, ZXLSF, a one-dimensional search 
routine was used to find the value of a that maximized (II.D.4.23). The 
starting value a was the least squares estimator of serial correlation 
givenwby WEET.D. 3. 1595 

Using 500 samples of sizes 50 and 500, respectively, 
from simulated TLAR(1) processes with Y = .64, the scatter plot analyses 
in Figures II.D.4.19 and II.D.4.20 were completed. The least squares 
estimator and maximum likelihood estimator appear to be correlated. 
From the accompanying tables, the maximum likelihood estimator appears 
to have a smaller variance and bias than the least squares estimator. 
Analysis of the boxplots from a SIMTBED comparison of the least squares 
estimator and the maximum likelihoood estimator reflect the same results 
(see Figures (II1.D.4321 — 11.0. 4.22). 

From the Normal plots given in Figure II.D.4.23, both 
the least squares and the maximum likelihood estimator appear to be 
coverging to a Normal distribution. There are three or four outliers in 
the tailkout ot 500 pornts: 

E. OTHER CASES OF THE NLARMA(p,q) MODEL 
1. In production | 
A primary advantage of the NLARMA(p,q) model is the ease with 
which the basic framework can be altered to cover a variety of different 
dependency structures. The NLAR(2) and NLAR(1) processes have been 


examined closely in the previous sections of this chapter. At this 
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time, the moving average first-order model, NLMA(1), and the mixed 
model, NLARMA(1,1), are briefly considered. The correlation structure 
and parameter space are discussed for each model. 

The TLAR(1) model for which the maximum likelihood estimation 
was completed, can be easily extended. As the final part of this 
section, we present the p order autoregressive processes for arbitrary 
p 2 2. The conditions for existence and uniqueness, the correlation 
structure and likelihood function are given. The maximum likelihood 
estimation scheme for the p parameters is also discussed. 

2. A Backwards MA(1) Model, NLMA(1) 
a. Correlation Structure of the NLMA(1) Process 
From (II.D.1.1), we see that X is the random coefficient 
sum of independent variables each of which have a marginal Laplace 


distribution. Therefore, we can replace X by another Laplace 


=| 
Variable. If it is independent of L and has a standard Laplace 
marginal distribution, then by the construction, X, Will still have a 
standard Laplace marginal distribution. 

in (MED ARA). ue 


If we replace X in fact Mba L. 


EU E 


obtain the following expression for X 


= 1 
x, = Keb + KL (TIME 284) 


where {Ki} and (L,) are as given in (II.D.1.2) and UM is the 
corresponding two-valued discrete variable as given in (II.C.2.4) for 


the NLAR(2) model. 
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Since X. is by construction in (II.E.2.1) independent of 


-k 
X, for |k| 2 2, we see that the model has the cut off property of a 
linear MA(1) model. The maximum range of correlations in any MA(1) is 
less than or equal to |1/2|, (Fuller [Ref. 29: p. 62]). This range is 
achieved by the linear MA(1) models. Some of the random coefficient 
MA(1) models have been shown to have a maximum range for the 
Corr (X Xn) to be strictly less than one-half (see Hugus [Ref. 30]). 


Using (II.E.2.1) recursively, we derive the serial 


correlation in NLMA(1) as 


Cov(X, ,X, 4) 
Corr(X, ,X, 4) foc ccc ec a 
Var (X ) 
n 
t 
ga BA D)! 
= 5 i 


gafi 4 





_ n-1 
S - ; 
a, 8, 
= t 
= —%— FLL), CK) 8,X.- tK. baa)» 
= aB ECK ,). (II.6. 212) 


Substituting in the values of the i.i.d. sequence (K, ) with the 


corresponding probabilities Po» 1-P> from (II.D.1.3) we have 
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Corr OG X 
mn 


21) = 9,8, ((1-p,) + Y(l-a,)87 po) 


1-87 + a,82Y (1724) 81 


`“ comm ND NEN C T- 


Figure II.E.2.1 is a contour plot hihi ye Wa UASI 


p(1) = Corr (XX, ). Notice that in this model, the correlation is 


=] 
restricted in range over that of the linear MA(1) models. Using the 
IMSL global constrained optimization routine, ZXMWBaewt1thumubtriphe 
starts, the extremes for lag-1 serial correlation are 1p (1) | < 0; 102655 
occurring at a, = .903 and P, = £.690. In Chapter III, we give a 
continuous random coefficient model with MA(1) correlation structure, 
Laplace marginal distribution, and the full range of correlations, i.e. 
BERENS 
b. Invertibility in NLMA(1) 


It is well known (Chatri lda Re m e Ri s Ko cui 
X = Z * B.Z (UE 2 uN 


is a linear MA(1) model, then substituting (1/84) IO d does not 
change the autocorrelation function. This implies that the linear MA(1) 
model is not uniquely determined by its autocorrelation function. 

It is also well known (Chatfield [Ref. 31: p. 43]) that by 
successive substitution, the MA(1) model in (II.E.2.4) can be written as 


the infinite autoregression 
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Likewise, if 1/8, is in (II.E.2.4), we have 


X + — X =. (LE ES ON 
1 


Unfortunately, only one of the two processes given by (II.E.2.5) and 
(II.E.2.6) yields a convergent power series depending on whether 


|8,| < 1 or not. Hence, the restriction on 8, called "invertibility" by 


"| 
Box and Jenkins [Ref. 23: p. 5005 guarantees a one to one 
correspondence between a linear MA(1) model and its autocorrelation 
function by restricting d to be such that the MA(1) "inverted" infinite 
autoregression is the one with a convergent power series representation. 

This definition of invertibility is not totally applicable 
to random coefficient models (such as NLMA(1)) with MA(1) correlation 
structure because it has not been established that there exists a 
corresponding infinite autoregression model. 

Likewise, there can be an infinite number of models that 
have the same autocorrelation function and marginal distribution. This 
is the case in NLMA(1). As was seen in Figure II.E.2.1, each contour 
line corresponds to a constant value of p(1) and is achievable by an 
infinite number of combinations of (04, 8.) . 

The purpose of this section then, is to find a different, 


but meaningful, way to restrict the (o, , 84) rectangle in Figure II.E.2.1 
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which: (1) does not further restrict the range of p(1); and (2) which 
within the region the NLMA(1) model must be uniquely determined by p(1) 


and either a, or B, > 


1 
From the contours in Figure I1.E.2.1, it appears that the 
feasible region for p(1) can be partitioned in such a way that the two 
goals stated above can be achieved. It is not known, however, if this 
partition can be described analytically. Figure II.E.2.2 is an 
illustration of the partition into a center region and two complementary 
disjoint regions. The center region is roughly defined as the region to 
wow? pa rot a line from (-1,.667) to (-.577,1) and to the left of a 
line from (.577,1) to (1,.667). Both lines cut across the contours in 
the depression on the left and on the ridge on the right. The center 
region is more advantageous for two reasons. First, p(1) is a 
continuous function of a. and B. in the center region. Secondly, the 
parameter estimation is more likely to be easier if the most extreme 
values of a. and B. can be avoided simultaneously. Therefore, we shall 
call the center region of Figure II.E.2.2 the "principal" region. 
3. A Mixed Autoregressive-Moving Average Model, NLARMA(1,1) 

From the theorem in Section II.C.2, we see that any two 
(possibly dependent) Laplace variables can be combined with an 
independent set of (again, possibly dependent) Laplace variables to form 
another Laplace variable. Using this property, if we replace Aa. in 


DROCK by La , then the marginal distribution of (X, ] is still 


E 


standard Laplace. We have then 
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- ' " 
S O sos nn nu n, (ID. E.3.i) 

1 t ` d 
where STAD IL, (K ) are as previously defined. 

Notice that if K is identically zero, corresponding to a, = 95 
we obtain an expression of the form given by (II.E.2.1) for NLMA(!). 
Likewise, if Ko is identically zero, we have the NLAR(1) model as given 
W TT D.1.1). 

The NLARMA(1,1) model has the same correlation structure as the 
linear mixed model ARMA(1,1). Using (II.E.3.1), 

E a B EXAT, ) * a85E(L ) 


n-] n-i E 1 


E EX - KL). (II ERZ) 


BUC" X gd and L. are independent so 


anid 
E(X X _,) = 20,8, + a,8,f0,8,E(L, {xX 5) 
2 
+ aB ElL _ L 5) + E(LA jK TIL (IE seo) 
Conditioning on K, ,, using the independence of 2 and 
ON; lass) and dividing by the Var (X ) we have 


p(1) = a,B 


8, + 9,8,C01-p,"p3*|b)]p>+]b31P3) > (II.E. 3.4) 
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where p, p, |b |. | are defined in (II.C.2.5) thromuz m rr e E 


zl, 
For 2 2 2, (II.E.3.3) and (II.E.3.4) become 


E(X X e (lt Basa) 


n-g) = %48 ECX 


and 


o( 2) a. B. p( £71). TSE ÓN 

These equations are the same as those of the ARMA(1,1) model 
(see Chatfield [Ref. 36: p. 58]). However, the range of correlations 
is significantly reduced over that o£ ARMA 11). ELQUI EN 
represents a side-by-side comparison of the (p(1), p(2)) space for 
NLARMA(1,1) and the familiar linear ARMA(1,1). Although p(1) can range 
from -1 to +1, the combinations with p(2) are severely limited in 
NLARMA(1,1). The minimum p(2) in NLARMA(1,1), found numerically using 
the reduced gradient method is approximately -.025 at p(1) = +.2. As 
lp(1)| increases, p(2) approaches p(1)?. 

4. Higher Order Autoregressive Models, TLAR(p) 
a. Introducvion 
It has been stated by Raftery [Ref. 32] that there exists 

NEAR(p) models for p 2 2. Also, Nicholls and Quinn [Ref. 16] have given 
conditions for the "existence and Uniqueness, Strict Stationary, ete; 
for general RCA(p) models. However, only for the NEAR(2) and the 
NLAR(2) processes has it been shown explicitly what the necessary 


innovation is; and that it is a valid random variable. 


106 


POINT PLOTS OF ADMISSIBLE REGION FOR p(1) AND p(2) 


b: NLARMA(1,1) MODEL 


a: LINEAR ARMA(1,1) MODEL 
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p(1) 


A 


point Plots of Admissible Region for p(1) and p(2) for Linear ARMA(1,1) and 


NLARMA(1,1) 


Figure II.E.3.1. 


Processes 


For p 2 3, this has not been accomplished for the general 
NEAR(p) process; nor is it done now for the NLAR(p) process. However, 
there are 2P different pn order autogressive models with p parameters 
that are special cases of the NLAR(p) process. These models are called 
the TLAR(p) models. The innovation for the second-order model was given 
without proof following the theorem in Section II.C.2. The likelihood 
function and maximum likelihood estimation of d was given in Section 
II.D.H for the TLAR(1) processes. 

The TLAR(2) models, including the two TLAR(1) models only 
account for four of the infinite number of NLAR(2) models which all have 
the same AR(2) correlation structure and standard Laplace marginal 
distribution. Since there is a variety of different sample path 
behaviors obtainable in the general NLAR(2) model, it is possible that a 
TLAR(2) model will not always be the most appropriate model for a given 
set of data. 

However, as is shown in the remainder of this section, the 
TLAR(p) models have an advantage over the general NLAR(p) models. The 
TLAR(p) processes for p 2 3 exist; are easily constructed; are partially 
time reversible; and are parsimonious with respect to parameters. The 
parameters in the TLAR(p) process are easily estimated from the 
conditional likelihood function by the method of maximum likelihood. 

b. Existence and Uniqueness 


The TLAR(p) models p 2 ! have the form 


D 
Xa) EC (II.E.À.1) 
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where {xX} is assumed stationary with Laplace marginal distribution; 
1 ; 
ES d'Ee Ei = K. is a p-variate discrete random variable independent 


X 


of te} and X For all n 


n-1' ^n-2'**** 


p 
WA AA A y as sss sos (quur 2) 


i) 


so E(K 
n 


) = a, for all i - 1,...,p. The 2P choices of model arise from 
the selection of signs for each of the X _, (either *1 or -1). 

Now if {x} is stationary, then the following expression for 
the characteristic function of the i.i.d. innovation, e,» follows from 


(II.E.4.1) regardless of the choice of signs on Xm (The distribution 


of a symmetric random variable Z is the same as that for -Z). We have, 


p 
ylw) = ELexp{-iw( } K 
n isi 


(i). 


n A 


D 
= > Cw) t) a, dy AA A 


n 1S] nt 


KOO 


= b (w) ES Ar vs 


n 


from conditioning on K the stationarity assumption of (X, ] and the 


independence of En O ee Ar Therefore, Subst r ucinga sm 


TEBA) 





ow) = Í ! VEH + j| 


1*0? 


AAA ES E. Hn 


For À > 0, (II.E.4.3) is recognized as the characteristic function of a 


scaled Laplace random variable with scale parameter vA. 
Since (II.E.4.1) can be written as 
(i) 


p 
Ko L a h. + (K c MEM (11, Bangs) 


and satisfies the conditions in Section II.C.2, the TLAR(p) models are 
RCA(p) models. Sincesthe Innovation ie} and (K ] are i1.i.d., then 
TLAR(p) are strictly stationary and {XJ is the unique solution by the 
theorems of Nicholls and Quinn [Ref. 16: p. 31 and p. 37]. 
e. Correlation Structure 
The TLAR(p) models are E autoregressive in the sense 


that ECX MX EO D c aa 
n 2 


He un Zu = E is a linear function 


n-p 


X. dec] D It is also autoregressive in the sense that it 
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Satisfies a set of Yule-Walker equations. Multiplying (II.E.4.1) by 


Kma £21, and taking expectations, we have 
E(X X _,) = a ¡EX "ota pE OQ X S p CIMA. 5) 
Dividing by Var (X ) and substituting $ 1,..:,p Eech SG. 


have the set of equations 


p(1) 


a, + ap(1) SOE USIP 


p(2) = a p(1) ta, t... ta p(p-2) 


p(p) a,p(p-1) * a jp(p-2) pea. KEE? 
where a; = a, (Sign of pt Por o xj. Me 
For the TLAR(2) cases, the (p(1), p(2)) admissible region is 
the entire diamond given in Figure II.C.3.1. It is divided, however, 
into four right triangles, one per quadrant, corresponding to the sign 
of X and X in the model. 
RES D 


d. Conditional Density of x |X X n 


dt 4 xq EDU IS 


The conditional density for each of the 2P specific choices 
of signs are easily found noting that the conditional probability is 


just a sum of Laplace cumulative distributions. We have 
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a POA Lo < x=x ) + a PA L. < xx) + APCYA Lo T 


e Xx A 
sar EM Pe E (11.5 108 
di 





where FL C) is the cumulative distribution function of a standard 


Laplace random variable. Taking derivatives with respect to x, we have 





p x= x. 
1 d x 
D (x|x yeoeyX ) = — ) a.f | | VAF =). 
(IL. EU» 


e. Conditional Maximum Likelihood Estimation of (a... a ) 
Since there are many p-variate Laplace distributions that 
(Kie KAO could have, and that the particular one is not known to us, 


it is not possible to form the exact likelihood function which is 


written 


n 
ly ee 
n 1 i=p+1 i 


|X. poses. fx acaso A: i (11.£.4.9) 
al i-p p 1 


Instead, we can calculate the conditional log-likelihood 


function as the logarithm of the product of the first. (1p) Dern sg 


dal 


(I1.E.4.9). This is commonly done. See, for example, Priestly 


[Ref. 33: p. 350]. Using a, =a sign(X, ), we have the following 


i =i 


Single expression for the conditional log-likelihood function, given the 


n realizations from TLAR(p) process, written as a function of a, for 


AA 





n 
84 ...a)=  ) nif 
1 WAA A 
n | D vs X. 
ee eee ie) ct, E RA n (IO) 
i=p+1 we Gar. oA y 
where 
MES M NH Pea RU EX ELS D. 
l ho^ ZC J (ES 
J AO 
i P3 J 

Í = ptl,... n; qa, = EI and à are functions of the variable a. 


We see that when p = 1 (II.E.4.10) and (II.E.4.11) give the 
expressions used in the TLAR(1) process in Section II.D.4. 

Asta function of p (C TILI E. .1T0) is continuous 
throughout the interior of the p-dimensional subspace on which it is 
defined. It is not differentiable with respect to a, anywhere that 
a. = 0. The maximization of (II.E.1.10) can be formulated as a 


1 


constrained non-linear program for which a numerical routine would 


Pie 


probably be required to solve for (a, ; dE ;, the -joint conditional 


likelihood estimator of (a....a ). 
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III. CONTINUOUS RANDOM COEFFICIENT MODELS 
WITH SYMMETRIC NON-NORMAL MARGINALS 
A. INTRODUCTION 

The discrete random coefficient NLARMA(p,q) models studied in the 
previous chapter offered a variety of different dependency structures 
analogous to their linear ARMA(p,q) counterparts as described in the 
Box-Jenkins NP en to time series analysis. These models, however, 
could be considered deficient in some ways. For one thing, all the 
models have, by design, the same marginal distribution, i.e. Laplace. 
To obtain a different marginal distribution would require starting over 
to develop the appropriate innovation sequence. Raftery (Ref. 32] has 
reported some results in extending the NEAR framework to other models 
with different marginals and ARMA correlation structures. 

Furthermore, the parameter estimation, which is easy to do in 
Gaussian linear AR(p) models, is not particularly easy in the 
autoregressive process of the NLARMA(p,q) family. In the moving average 
and mixed models of NLARMA(p,q), the maximum likelihood procedure is 
even more difficult. Raftery [{Ref. 32] claims that the maximum 
likelihood estimator of B,in the NLAR(1) process would be super- 
efficient based on his work in parameter estimation in the NEAR(1) 
process and the extensions that he has proposed. Super-efficiency is 


not an attractive property of an estimator. 


lie 


Again, the moving average model, NLMA(1), does not allow for the 
full range of correlations that are obtainable with the linear MA(1) 
model. 

Finally, note that there is another attractive property of the 
random coefficient models that is not fully exploitable in the disorete 
random coefficient models (NEAR(1) and NLAR(1)). That is, in the 
NLARMA(p,q) models the coefficients of the process can change somewhat 
over time and the process itself remains stationary. Andel [Ref. 34] 
has noted that in many applications of time series analysis, 
particularly in the fields of hydrology, meteorology and biology, the 
coefficients of the model are attempting to describe complicated 
processes. The coefficients may have some random behavior of their own, 
apart from that usually attributed to the independent innovation 
sequence. 

If stationary constant coefficient models are not particularly good 
at modelling such systems (as suggested by Andel [Ref. 34]), then the 
NLARMA(p,q) models would not be much better because the coefficients are 
limited to a finite (very small) number of possible values. However, 
Lawrance and Lewis [Ref. 6] have shown in the case of NEAR(1) that it is 
possible to alter the character of the sample paths of a given low-order 
autoregression by extending the two-parameter model to one having 4 
parameters. The number of extra parameters could be excessive and the 
costs in parameter estimation unacceptable. 

In this chapter, a different family of stationary random coefficient 
time series models is introduced which retains many of the favorable 


aspects of the NLARMA family (specified marginal and correlation 
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strueture) and offers alternatives in the areas pointed out above as 
disadvantages in the NLARMA construction. 

The symmetric marginal distribution can be specified by one shape 
parameter to be any one of an infinite number of non-Gaussian distri- 
butions. This family is the £-Laplace family and is examined in the 
next section. The family--including as a special case the double 
exponential (Laplace) distribution--has members with extremely high 
kurtosis, as well as those that have a limiting kurtosis that approaches 
that of the Gaussian distribution. This offers a significant advantage 
over the NLARMA models. 

Just as discrete random variables are needed for the coefficients in 
the NLARMA(p,q) models, the square roots of Beta random variables are 
used in this family of models to maintain the £-Laplace marginal distri- 
bution. The square root Beta transformation theorem is the key result 
through which all the time series models in this chapter are formulated. 
By the theorem, Laplace variables are changed into those that have 
£-Laplace distributions. Previous uses of Beta random variables in 
modelling non-Normal time series is evident in the models with Gamma 
marginals of Lewis [Ref. 35] and Hugus [Ref. 30]. 

The fact that the coefficients are continuous instead of discrete 
allows for a continuous variation. That they are functions of Beta 
random variables restricts the variation to a bounded interval. This is 
likely to facilitate the modelling of those "complicated" systems as 


described by Andel [Ref. 34]. 
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The principal models investigated in this chapter are those with 
first-order autoregressive correlation structure. They are first-order 
Markov processes. For the purpose of discussing parameter estimation in 
this family of autoregressive models, as opposed to the NLAR(1) family, 
the focus is narrowed to that AR(1) model of the family with Laplace 
marginals--the so-called Beta-Laplace First-Order Autoregressive model, 
BELAR(1). Several point estimators of location and scale are discussed 
and examined through simulation in SIMTBED [Ref. 15]. The one parameter 
which uniquely determines all the correlations of lag k in the BELAR(1) 
model can be estimated by a least squares procedure which has very nice 
asymptotic properties. The maximum likelihood estimator of serial 
correlation is also obtained using numerical methods. 

First-order autoregressive correlation structure is not the only 
type of dependency relationship that is obtainable from using the square 
root Beta-Laplace transformation. In the last section of this chapter a 
first-order moving average model and an extension to a a "order model 
are introduced. The MA(1) model retains the full range of correlations 
of the linear MA(1) models. This was not the case in the NLMA(1) model. 
B. &-LAPLACE DISTRIBUTION 

1. The %-Laplace Random Variable 

It was shown in Section II.B that the standard Laplace 
distribution belongs to the class of infinitely divisible distributions. 
The probability density function of a Laplace distributed variable was 
given in (II.B.1). The characteristic function of the standard Laplace 


random variable was given in (II.B.2). Thus if 
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tal ` Less) 


; 10270 ( JSIGIB. 15.1) 
then X is a random variable. In fact it is the difference of two 
independent, identically distributed Gamma(%,1) random variables where 2 
is the shape parameter and 1 is the scale parameter. Therefore, if X 
Mase a Characteristic function given by (III.B.1.1), then X is an 
&-Laplace random variable. 

SumosCIII.B.1.1) is a real function of w, X is a symmetric 


random variable. It is easily verified that 


x 0 pt nocbswodd, 
E(X ) = CINEDOR. T. 2) 
(ei ) E44 LJ TE ne ak. lkc'w7152...., 
where btk] = D(b+1)...(b+k-1) for all b > O. Since all odd moments are 


zero in (III.B.1.2), the &-Laplace distribution is not skewed for any 


% > 0. From (III.B.1.2) we find that the kurtosis is 


y y 
EX) =~ CECX 2) [2] [o] 
= z 322 (IB. 1.53) 


The kurtosis approaches 3 as £ > o, which corresponds to that of a 
Normal distribution. 

Since an %-Laplace random variable, X(%), is the difference of 
two i.i.d. Gamma(&,1) random variables, we obtain the density for X(£&) 


by using conditional expectations. 


119 


[SP G, (2,1) and G, (82,1) are the i.i.d. Gamma(£,1) random 


variables, then conditioning on G (2,1), we have 


P(X<x) = P(G,-G,<x) ak A 
Ec (P(G, x*g)) 


2 


CIN ENIRO 


Since Gamma random variables are non-negative, 


0 DD Sa, 
P(G, <x+g) = 
F. (x+g) TI^ SDN T (IL ERIS 
1 


where Fo (x+g) is the cumulative distribution function of G The 


1 


expressions are shortened from G; (2,1) to G(2,1), because they are 


1° 


i.i.d. Therefore, (III.B.1.4) can be written as 


= 
P(X<x) = J F (x+8)£.(8)dy, (III.B.1.6) 
p= (x) 
where 
g" exp(-g) g > 0, 
ACD g T(2) 
0 otherwise, CPAD li 7) 


is the density of a Gamma (2,1) random variable and again, because of 


the non-negativity 


120 


poe 
XU AE (ITI B. 1.8) 


Me erent iating (III. B:1.6) using Leibniz’ rule Dor the 


derí vative of an integral with variable limits, we have, after some 





simplification 
id 
£, (432) = J erg Ix exp(-(2g*u)) dg. (III.B.1.9) 
g-L(u) BE 


Now if 2 is a positive integer, (III.B.1.9) can be evaluated 
analytically using integration by parts. If 2 = 1 we obtain the density 
of the standard Laplace distribution. For 2 = 2,3,4 the densities are 
also well known derivations given, for example, as textbook problems by 
Feller [Ref. 25: p. 64]. Feller however looks at the results of 
Gms 1.9) as the n-fold convolution (m= 2,3,4) of i.i.d. standard 
Laplace random variables. Figure III.B.1.1 shows the densities of tne 
£-Laplace random variable for $ = 1,2,3,4. Note how the graphs take on 
the shape of a Normal density with oa? = 22. 

2. Numerical Evaluation of the %-Laplace Density 

If 2 > 0 and is not an integer, then (III.B.1.9) must be 
evaluated numerically. We will be interested in the evaluation of the 
density in (lLlive.1.9) for 0 < R < 1, in order to calculate conditional 


densities and likelihood function. 
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Figure III.B.2.1 displays examples of densities for non-integral 
£ obtained by using the IMSL numerical integration scheme DCADRE to 
evaluate (III.B.1.9). The upper limit of integration in (III.B.1.9) is 


replaced by a suitable constant m > 1. Since for g > 1 and fixed £ and 


> TO, 

1 1 TS expi-tut2z 

sors expí-(2g*u)]) <  — e , CITPSB:2.1.) 
then 
P I exp{-(u+2m)} 

|DCADRE-f (u; 2)| < TES e (TLB. 272) 

Difficulty in integrating comes about because of the singularity 
at the lower limit of integration. Df £2 this Singularity 


disappears by rewriting (17G(Gi)0)" ^ as ee) For HL < 1, there 
are two alternatives for removing the singularity. We can transform the 
variable of integration, g, to become t - D and tne singularity at 
g = 0 is removed. Or, we could do an integration by parts to remove 
either the singularity at g = 0 for u> O or at g = -u for u < O. In 
either case, the remaining integral must be evaluated numerically for 
ues 0. 

Since X is a symmetric random variable we can rewrite 
(III.B.1.9) using integration by parts to obtain an expression that will 


be easier to apply. For all u = 0 
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» gs» 2 MEE 
£y (u; 0) - expt ub) J IRA exp(-2g)dg. (III.B.2.3) 


2= 
g-0 (g+|u|) 
ALAS .5 note that fo (u) is not defined at u = 0. For 2 >-.5 and 
u = O, 


f,(0,4) = P(28-1)/{T2(2) gem 


}< @ , (DBIEESB. 2.8) 
3. The Square Root Beta-Laplace Transformation 
mane principal result of this; ¿section is the proof of the so- 
called square root Beta-Lapiace transformation theorem. By mis 
technique, an &, Laplace random variable can be transformed into an 


$ 2.. The time series models 


%& “Laplace random variable where ho S 4. 


2 
developed in Sections III.C - III.F rely on the following: 
Theorem: 
Let X ~ %£-Laplace and B - Betal(a,%£-a), where 0 < a < £ and B is 


= Bo ex, then 


defined on the interval [0,1], i.e. standard Beta. If Y 
Y - a-Laplace. 
PPOOF : 


By conditioning on B, we obtain the following expression for tne 


enaracteristic function of Y; 


$y (0) = es 
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E [E[exp(ib “Xo))J 


EC (17 (108) )*]. (TIT. Boje) 


Since bw* > 0, a convergent power series répresentation of 


(III.B.3.1) is given by 





o [k] 
$, (0) - E Í ) E A (IIL B: 3) 
k=0 
where again glk J = 2( 241)... UG k= 1) forn an 0 Só 


Interchanging the expectation and summation in a convergent 


power series gives 


` gtk! , 
Oe = —— (-w~*) 
y pio aii 


SEE (LR 


From Johnson and Kotz [Ref. 36: v. 2, p. HO], we have 


E( B^) E a Kd lk] for k integer. (ILLE d 


Substituting (EIT. Be 354) ana (TIPP AU p 


[k] 
o 
k! 








QM S (III.B.3.5) 


bw) = 1+w 
CE. DE 


k 


li 71 8 


0 
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C. £-LAPLACE FIRST-ORDER AUTOREGRESSIVE TIME SERIES MODEL 
IE untroduction 

In this section, we exploit the square root Beta-Laplace 
transform to define a 2-parameter first-order autoregressive model ín 
£-Laplace variables. The first parameter, £, determines the non- 
Gaussian symmetric marginal distribution of the time series ensemble. 
The second parameter, a, given the value of 1, determines uniquely the 
Wi | Serial correlation. Since the model is shown to besfirst-erder 
Markovian, a determines the entire autocorrelation function up to the 
sign. We show also that the models are always partially time reversible 
with respect to both runs probabilities and directional moments. 

Writing the stationary time series (Xx, (2)) in the form of an 


additive random coefficient equation, we have 


IA 1/2 
X4 (GR) = A Loi oi. HÄ ER (2 0, 0)L (2), CITT A 


where (X, (2)) is assumed to be a stationary time series with a marginal 
£-Laplace distribution; (al’*(a,2-a) } is an i.i.d. sequence such that 


A (a, %-a) is a standard Beta; (B'’*(2-a,a) } is an i.i.d. sequence 


|: 


independent of (A, (a,%-a)j such that B,(%-a, a) is also standard Beta; 


and (LíC2) is an i.i.d. sequence, independent of the coefficient 
processes, such that L (2) is £-Laplace. The coefficient A Ca, fa) and 
etc. IIA SI 


B, (za, a) are assumed to be independent of X X 


n-1' "n-2' 


assumed that Aes (2) has a 2-Laplace distribution, then by the theorem 


me ection 111.B.3 so does Xk). The fact that the process is 


Ved 


Markovian follows by construction. To start the process na 
stationary distribution set Xy 0) = LY . 

The parameter space is l > O and O Sa & i1. 

For the Beta random variables A, and B. (hence their square 
roots) to be properly defined, each of the parameters must be positive. 


Hence, when q = O or q = $, (III.C.1.1) as defined above is no longer 


appropriate because each of vi" and Bi /* has a parameter that is 


Wee 


identically zero. If a = 0, it is understood that the (A ) sequence 


is identically zero and the (B °°] sequence is one; therefore, 


(III.C.1.1) becomes X (4) = L C2) and the {xX} sequence is the (Lj 


corresponding to the i.i.d. &-Laplace case. For a = 2, AT is one and 


SC is identically zero; therefore, if Ko = Lok fi, then X is 


£-Laplace, but is not an ergodic process. 


If we let 
1/2 
Se Ba (2 a, aJL (2) (TITLE) 
then by the Theorem in Section 111.B.3, Eer" (1-a) Laplace wi m 
F = 0 and Var(e,) = 2(£-a) for all n. Since the variance must be 


non-negative, it is also necessary that a S %. By the stationarity of 


{x} and since al’? (a,2-a)X_ (2) is independent of e,» the 


za 


characteristic function of the right-hand side of (III.C.1.1) gives 





1 ee La 1 L 
EE Ga eet ` tia} i "T 
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Examples of sample path behavior for selected 2 and a are given in 
Figure III.C.1.1. Note that although the correlation coefficient is 
approximately 0.8 for all sets of 2 and a in Figure III.C.1.1, there is 
considerable difference in the sample path behavior as £ changes. For 
the samples from small values of %(.10 and .05), there are runs of 
values that are very nearly zero in magnitude. 
n Correlation Structure 
Using equation (III.C.1.1) reeursively along with the 


stationarity and independence of the process (X ), we have 


E(X GX, 102] 


(Ü) = ——— ooo 
E(X*. 1000] 


p(1) = Corr(X, (£),X, , 


17/2 
EL (A, Ca, R-a)X (2) te HX, ,(2)] 
E(X* 100] 


1⁄2 2 
EL A. (a,%-o))E(X ,(2)]) 


2 
E(X* .(2)] = ELA, Lë, dek, (Da .C.2. 19 


From Johnson and Kotz [Ref. 36: v. 2, p. 40], we have for 
i - Betala,%-a), that 


or Rm 


2:70 PTT. C see 
TO forall n, r 0 ( ) 


E(A (a, %-a)) = 


where T(.) is the incomplete Gamma function. Therefore 
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goer 2) Cl), poh atl /2) Toei) a 
AD FAT) ^ RT(Re172) T (a1) apo .2:9) 


Meee that as a > R, then p(1) > 1. Similarly as a * O, p(1) + O. 
uu Ore, we obtain a full range of positive correlations in a 
one-to-one function of a for any given value of £. 

Also from (III.C.1.1), we see that tne process is explicitly 
autoregressive. It is also autoregressive in the sense of expectations 
in that E(X G2 [x (0 suis a lineanotumetionsofue Sia 
(III.C.1.1) defines a first-order Markovian process, p(k) = 91) IKI for 


all k. To see this we write for all k 


B(X (X 
22 


E 
p(k) = K 


EIN, OX MOI 


1/2 
= ELA, A: Zn 


G l) 1) 


0(1) p(k-2) p(1) 


coi} IKI, 


1 
If we replace A (a, La) in (MIME. 1) by -Al (a, ba) we have 


iat 1/2) TCL) comandos) 


pe) m F($«1/2)T(a) ^ WELT STAT) E CS) 


Toll 


We can, therefore, achieve a full range of negative correlations, and 


likewise 
p(k) = C ts nh for all k. (LL 


3. Partial Time Reversibility 

The &-Laplace first-order autoregressive models are partially 
time reversible, both with respect to the directional moments, 
(R OOX hl for m = O, +1, +2,..., and with respect to runs 
probabilities, P(X ($)«X _.(%&)) = P(X (W%)>X _.(&)). 

Using mathematical induction, stationanity of (X (%)), and the 
independence of the coefficients and the innovation from each other and 
previous values of [X (%)), it is the case that LEI ak All 
= FIX (LX _ Oh = 0 for all n and for all m = 0, 1 TOS 
X ` &-Laplace. For m = 0, E(X 2) = 0 by (III.B.1.2). Assuming for m = k 


that E (X "X ) = 0, we have for m = k+1 after substituting from 


Il = 


(TLIC wel) anda Crier esl e nao 


2 _ 2 e 2 
Ge EE p Fo n ` mn 
= 2 
BAA D 
= 2 = 
E(A )E(X2X__)) = 0. (TIT Ces ap 
Assuming for m = k that EX Xp? = 0, we have for m = k+1 after 


substituting again from (III G Ii) anda ie) 


TS? 


E(X X? Bex eer 


n n-(k*1) n-(k+1) n n 
1/2 
= E(A, scene 
1⁄2 
= ECA, JE Aal = 0, CERIC. 3:29) 


To see that this model is also partially time reversible with 


respect to runs probabilities, we show that the random variable KÉ = X, 


e. is symmetric. Now KÉ is synmuetric™mir amdan onI y METNE 


characteristic function of Z, is real valued. We write 


))] 


$50) Elexptiw(X, _ X 


n-1 


E[exptiute, -(1-A x. .32] 


UE 








= Elexp(iwe ))Elexpl-iw(1-A, p 
1 = ? e 
E $ E [ELexp(-iu(1-a X, 31] 
_ 1 po i 1 ^ 
= a A (Ade) 
5 " ne 


Since (III.C.3.3) is real valued that concludes the proof. 
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D. THE BETA-LAPLACE AUTOREGRESSIVE MODEL, BELAR(1) 
1.2 Introduction 
In this section, we set 2 = 1 in TITOR DI a anda AAA EA 


obtain the following expression for the BELAR(1) process 


X= AL (a, 1-2)X, te, (III.D.1.1) 
where le) is an i.i.d. sequence with GE (1-a)-Laplace with moments 
and density given by (ITIL. BI. 2 md III p pon X now has a standard 
Laplace marginal distribution. The only parameter in the model is a 
with O Sa < 1. All the results of Section DEDI .COStil hold with {ee 
Examples of sample path behavior are given in Figure III.D.1.1. 

We do two things in this Section. First, we derive the 


equations for the conditional density of X |X The second is the 


np 
derivation of joint density and the logarithm of the likelihood 
function. The expression is used in Section II1.8.6 to obtain tue 
maximum likelihood estimater or a 

2. The Conditional Density 


To find the conditional density of X X we will need the 


density of a la,la). Let An be a standard Beta random variable with 


n-1’ 


parameters (a,1-a). Since A is defined only on the given interval, 


zero to one 
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2 
1/2 P(A «a ) Om ma c ] 


P(A = <a) = 
n 0 otherwise 
x=a 2 
s: | as IN E CETT CDM 
x=0 E 


where fa (x;a) is the standard Beta(a,1-a) density given by 


aec REO) 0 < a OP 


n 0 otherwise. (III p 2 


Differentiating (III.D.2.1) with respect to a, we obtaim the follow mm 


expression for 


2 4,4271 
E TACHO = TC HESS FM ; O. < ae de (ITI.D.2 9 
m (1-a ) 


Examples of (III.D.2.3) are given in Figure III.D.2.1. 
Now we evaluate P(X, 4 x |x m using (III.D 1.1), (TI. B 


and (III.B. 35) Condon on at’? (a, 1-9) we obtain 
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II 


1⁄2 
P(X <x|X -.=y) = PRIA A Op 


172 1/2 
= ins < x A (a, 1 a) |A =a} ] 
n 
- E 1/2! Cn « x-ay)] 
n 


- po (x-ay)] 


n 
n 
a=L,(x) 
= F. (xay) TACHO da , (III ER 
n A 
a=L (x) n 


where from (III.B.2.3) the cumulative distribution function of e, can be 


written as 


| u=x-ay 
ae | f. (uj1-a)du if x-ay 2 0, 
u=0 n 
FS (x-ay) = (TIII.D:2 
n 
u=a y-X 
J f. (u;l-a)du if x-ay < O, 
u=0 E 
and L (x), i = 1,2 are the limits of integration on a which may be 


functions of e 


SS 


Since Fe (x-ay) changes definition for negative and positive 
n 


(x-ay) and since 0X< a < 1, we rewrite (III.D.2.4) based on the ratio 


x/y, which is a constant. Thus 


a 
J F. (x-ay)f ada” HERE 


1/2 
a=0 " ET 
A y) = CUIIED.26) 
a=X/ y 
| F. (x-ay)f , /208322da 
n A 
a=0 n 
az] 
| Í | F. Ux=ay if 1 osaia)da ee See 
| n A 
| a-x/y n 


Differentratimgs (111.07 2.4) with respeet tomx using Leibniz' 
rule gives the following general expression for the conditional density. 


We have 


x «In 


d 
| pue xo MN 
D n-i 


a=L,(x) 


= | f. (x-ay;1-a)f (a;a)da 


A p12 
a=L (x) n 


+ De (x-yL (x))f TA 
n A 


n 


(x);a) LO 
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= F. (x-yL.(x)f 


d 
à NAE E (TL1.D. 227) 


n 


From (111.B.2.3), (111.D.2.3) and MID Set 


E 2349" exp(-(2g*|x-ay |)) g "(2g*2|x-ay|*a) 


CIII.D.2290 
T(1-a) T(a) (1-2)? (gs|x-ay|) "9 (1-0) 


Now using (III.D.2.7) to differentiate each expression in (III.D.2.6), 


we have the following explicit expressions for 


Ë 
J J h(g,a)dgda if Ewe or x/y <S" 


, 


(Ill D oo E 
n-1 


n 
a=x/y H sm 
J | h(g,a)dgda 
a=0 g=0 
a=1 RE a 
* | | h(g,a)dgda if O< ay 
a=x/y g=0 


It will be seen later that working witì (III. D. 2.9) wi TEEM: 


inconvenient. Hence, we rewrite (III.D.2.9) as 
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IAN 


IV 


a=1 
J f i(x-ay);i-alf 1p asadda 
220 n A 


n 


WA 
© 


on x/y 


(x|y) - (III.D.2.10) 


=X/ y 


a 
J P. Dos ft WA 
d n A 


0 n 


(a;a)da 


a 


a=1 


+ J o as 


a=x/y n 


(a dae 0 <ux/y < 1: 


The conditional density in (III.D.2.10) can assume different 
shapes as a function of x depending on the fixed conditioning value, y, 
and the particular, fixed a. If a = 0, then (III.D.2.10) becomes the 
standard Laplace density as given in (II.B.1.1) with u = 0 and A = 1. 
If y = O, then (III.D.2.10) becomes the (1-a)-Laplace density as given 
in (III.B.2.3) with $ - 1-a. In Figure III.D.2.2 are presented different 
examples of (III.D.2.10) for a fixed y and different values of a. Note 
murMc a «172 then (III.D.2.10) is continuous for all x. If mw š 1⁄2 
ENG (IPIE D.2.10) Ps undefined, e.gl, x = y = 0. 

In a similar manner, expressions for (III.D.2.5)-(III.D.2.10) 
can be derived for the BELAR(1) model with negative correlations. 
Placing -al * (a, 170) in (III.D.1.1), we replace x-ay by x*ay and 
determine the appropriate form of the conditional density based on the 


ratio (-x/y). We have for the negative BELAR(1) process 
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a 
J f. ((x+tay);1-ajf y (450)da 
n A 
a=0 n 
MEX y eR Or -x/y š 0, 
Es |x (x|y) = (Tpr D 2.11) 
no n-1 
a=-x/y 
| f. ((x*ay),1-a)f , ,,(a;a)da 
n A 
a=0 n 
a=1 
+ J f. occ j o) ; /2 (830208 
n A 
a= -x/y n 
| IGO € -—-x/yce 1. 
3. The Joint Distribution and the Likelihood Function 
AN expression for the joint density of AS can be written 
using f, x (x_]x,_,) and f, (x,) as follows: 
me n=] 1 
nal 
f pae EE O PATA ON. [X 
X ++ X, P 1 D ee Xn-(k-1) nx REEL EE 


(IID 3S 


The log-likelihood function as a function of a given . us uste 


natural logarithm of (III.D.3.1). We have 


nzi 
L(a) = -(1n 2 + |x, D * } Inf, 


TO 
k=] n-(k-1) E 


ES *n-k 


(ITT Dn) 
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It is now a simple matter to determine which branch of 
(III.D.2.10) or (III.D.2.11) r S rc q GR Or mb ón (X aX VQ) and to 
substitute it into the sum in (III.D 3 O BB OR (C OC E E 
discussion of the likelihood function until Section III.E.6. 

4, Numerical Evaluation of the Conditional Density 
a. Introduction 

This section is devoted to explaining the methodology by 
which we came to resolve the problems in the numerical integration of 
the conditional density. This is as important an issue as the 
derivation itself, since the likelihood function and the maximum 
likelihood estimators can not be evaluated without it. As is pointed 
out below, the standard numerical routines were unsuccessful in 
accurately evaluating (111.D.2.9) aroUnd the Singularicreseim 
(III.D.2.8). We also give and justify the approximations that were used 
to remove each of the singularities. The graphs in Figure III.D.2.2 
were obtained using the method. The methodology was used again in 
Section III.E.6 to evaluate the log-likelihood function in the method of 
maximum likelihood estimation. 

In the FORTRAN routine that caleulaves the wcondee en 
density as given in (III.D.2.10); the approximations in (1I120-4- 073 
(III.D.4.8) and (III.D.4.11) are added to the results from DCADRE. 
Combinations of these approximations are invoked as necessary depending 
on the ratio x/y. 

The same procedure is used to evaluate the density in 


(III.D.2.11) for the BELAR(1) model which produces negative correlations 
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for odd lags. We just check for 0 < -x/y < 1 and choose the appropriate 
value of cin (III.D.4.6) and (III.D.4.8) where x-ay is replaced by 
x+ay. 
b. The Methodology 

Attempts to evalute the conditional density, as given by 
(III.D.2.8) and (III.D.2.9), using the standard IMSL double integration 
routines failed. Even the IMSL routine DBLIN which is often successful 
in handling ill-behaved integrands, was unable to evaluate (III.D.2.8) 
around the singularities. For a < 172, along the lines a = O anda = 1, 
(III.D.2.8) is unbounded. Similarly for a 2 1/2, along the line a = ] 
s L he point” (g,a) = (0,x/y) for O < x/y < 1, (III.D.2.8) is 
unbounded. Arbitrarily declaring (III.D.2.8) to be zero under these 
conditions did not always allow DBLIN to accurately evaluate 
"upon. 9). 

We succeeded in evaluating the conditional density by 


working with the form given by (III.D.2.10) with f (a;a) given by 


D 
n 


imm D. 2.3) and E {(x-ay);1-a} given by (III.B.2.3). We used the IMSL 
n 


routine DCADRE to construct an extensive table of values for the (1-a)- 
Laplace density with the intention to linearly interpolate from the 
table as needed. The error in the value of f (Jul;1-a) in the table is 


controlled by DCADRE. The error in the value of f (|ug|;1-o) obtained 


ol 
by using linear interpolation for IM Hogmrm the tablesis calculated in 


the standard way. From Gerald (Ref. 28: p. 168] 


145 


d*f. (c;1-a) 
h*s(s-1) n | 
— 


3m (III.D.4.1) 


Error Interpolation| = 
where h is subinterval length ands = (ug7u)/h. Substituting the second 
divided difference into (III.D.4.1), in place of the unknown second 
derivative and also noting that the worst case for linear interpolation 


is at the center of the subinterval, we have 


Error Interpolation| < -5 AE, (lu|;1-o)] , (TI p 2) 
n 


where Den. is the second difference. Because f. (lu|;1-o) is non- 
n n 


negative and monotone decreasing in |u|, the largest values of A2f ane 
in subintervals close to zero. The table that was constructen 
therefore, uses smaller subintervals close to zero and larger 
subintervals further out. 

Finally we used DCADRE again to evaluate (III.D.2.10) except 
near the singularities, which we were able to evaluate analytically and 
then add back. The technique is often referred to as "removing the 
singularity". 

c. Removing the Singularities Due to (III.D.2.3) 

We now describe how we evaluated the integrals in 
(III.D.2.10) in the vicinity of the Singularities ine eeepc. > ae 
see that the density of A * (a, 1-0) given in (111.D.2.3) 138 undef imewdear 


a = 0 and a = 1 fora < 1/2 and at a = 1 for ae 1727 We "also mere tre: 


(III.D.2.3) that for sma ç ROP O ec ale = 
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244971 


> oO  — : ED Y. 
VA s : TÜRDTOPES ' Orc a < 5 (TETED 3) 
n 
amdgeror adii O0 <la ] 
2a 
f 12 basa) S 2 HS UO. ae k. ES cca qme (III.D.4.4) 
A, T(a)r(1-a)(1-a?) 


Therefore for a < 1/2 and 1 $ x/y or x/y S O we have from (III.D.4.3) 


= zai 


2a 
Tla)Fli-a) fg | (x-ay);l-a}da. 


0 


bor a) f ore = 


0 En a 


E —— f» 
Il *—— © 


E ODE D. 259 


Since m (*) is continuous in this situation, there exists a number c so 
n 


EU < c < ó and Ix| $ |x-ey|uSeIx-6y]| and 


a= 544071 a= 2071 
J uaaa e ae aada = CE -a] [ TUI SH da 
a=0 a=0 
2 
8 — Clea 
ñ Ps ELO 2) nilo) 
(IIT D. 6) 


natural approximation for c allows ¡oy | to be the average, 
(1/2)|2x-óy]. 


For all a and 1 < x/y or x/y < O we have from (III.D.4.4) 
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a=1 
J fi (a t. CS ` 


a=1-6 ba 


. 1 2a p es 
J TOR =i mansn t, ay); lada AA 
a=1-6 | 


Likewise there exists a new number cso that 1-6 < ce < 1 and 


Ix-y| « Ix-ey] « |x-v*éy| ana 


E — l| E e ( (x-ay);1-a)d 
J CORA) ü E 
a=1-6 
Se | za 
= f. i xov lem) J Tara) (=) a 
n SÉ 32) 
es 
= f. EE =a] T(1+a) Plo-a) (IIED US 


n 


Again a natural approximation for c allows Ix- ey | to be the average, 
(1/2) | 2(x-y)*$y |. 
d. Removing the Singularity Due to (III.B.2.3) 
The final type of singularity occurs when 0 < a = x/y < 1 


and a 2 1/2. When this situation occurs we leave P (+) under the 
n 


integral and argue that in a 6zneichbophood around x/y iE 


f TACUIT = f 1/2 7 ya) . Note that by the same argument that gave us 
A 


A 
n n 
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(TIT.D.4.6) and «4lIII.D.4.8), there exist two numbers c, and C, SO that 


(x/y)-6 $ c, S x/y and x/y $ c, S (x/y)*6 and 


1 


a=x/y 
J Z^ Jo (8500, {(x-ay);1-a}da 
a=(x/y)-6 on d 
a-(x/y)*6 
+ J BAA Ca ((x-ay);1-a}da 
n 
a=x/y n 
a=x/y 
= A J f. ( (x-ay);1-a]da 
n a=(x/y)-6 E 
a-(x/y)*6 
iD J f. {(x-ay);1-a}da. (III.D.4.9) 
A n 
n a=x/y 


We chose to approximate Cc, and C5 DOCM ALO? Xy « Ll, and have 


f 4/2 07 Y 3a) Wu xpo UE nS E-UNMcoDpSxe- OQ and y = 0 simultane- 
A 
n 


ouv. the value of (III.D.2.10) is undefined for a 2 1/2. 


Now changing the variable of integration so that (x-ay) = u, 


Weenave from (I1II.D.4.9) that for all @ 2 1/2 


a-x/y a=(x/y)+6 
f. ax ay) ada = J m Ix ada 


as(x/y)-6 š A y n 
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US 
= TT f. (u;1-a)du 


u=0 2 


HA 


P 
E 28 (III.D.4.10) 


because ES (+) is a symmetric density. That is (III.D.4.10) is an 
n 


! 1 A em 
expression for TA wo eS |y5|) where e, is the (1-o)-Laplace 
innovation random variable. Therefore, we add back to the DCADRE result 


the amount 


1 1/2 
( Ip (KU yai POA aa lys|)) S tp f (x/V¥3sa) < © 5 yer Os 
[y] A1? n y D 


i n (TIT. Da dee 


We choose the following combination as the value for 
DEN ES lys|). 


i) Using the trapezoidal rule and the table of values for 


the (1-a)-Laplace density we found 


u=M 
P,(O < e Pp o Po J f. (uj1-o)du . (ILIQ0D.. 02) 
n 


u=| y 6| 
Equation (III.D.4.12) is the average of the upper and lower Riemann sums 
of the tail of the density subtracted from 1/2. Using (III.D.4.12) 


instead of directly integrating f. (u;1-a) from zero to |y$6| is 
n 


preferrable, because for a 2 1/2, f. (0;1-a) is undefined. The error in 
n 
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(i 2) from using te trapezoidal rule approximation is 


h? 
approximately EL DNE (i1)| in the SC subinterval. Even though there 
n 


are over 400 subintervals, the second differences Af (i) are very much 
smaller for a 2 1/2 in the interval [|yé|,M]. 


ii) A second measure of P(0<e <|yé|) is the lower sum 


OS = Ee cue (Ciel 71:3) 


since P(0 € e, « lys|) is always at least as large as (III.D.4.13). Our 
approximation for P(O < £ < lys|) is the maximum of (III.D.4.12) and 
(I11.D.4.13). We use the maximum because P, given by (III.D.4.12) could 


be negative when |y6| is close to zero. This follows because P (u;1-a) 
n 


is strictly decreasing for u > 0, and thus the trapezoidal rule over- 
estimates the integral in (III.D.4.12). 
E. PARAMETER ESTIMATION IN THE BELAR(1) PROCESS 
iE Invroduetron 

In this section, we develop estimators for the parameters in the 
BELAR(1) process and report results on properties of these estimators 
obtained from analytical comparisons and simulations. We examine 
estimators for the location parameter, u, and the scale paramter, À, 
of the series (X ); the parameter, a, of the random coefficient 
A! * (a, 1-2); and Y, the lag-1 serial correlation, which is a monotone 


function of a. 
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The theory of conditional least squares estimation for the 
BELAR(1) process using the linearized residual is derived using results 
from Nicholls and Quinn [Ref. 16]. We give a corollary to their Theorem 
3.1 pertaining to the strong convergence and asymptotic Normality of the 
least squares estimator of Y, the lag-1 serial correlation. An estimate 
for a is derived using the fact that Y = Ela (a, 1-0)). Also, we show 
that the joint least squares estimator of location and correlation for 
the BELAR(1) process is the same as for the linear AR(1) processes. 

Other estimators of lag-1 serial correlation in the BELAR(1) 
process are derived using the ideas of robust estimation of Huber 
[Ref. 37j and least absolute deviation (LAD) estimation as applied to 
ordinary linear autoregressive models by Denby and Martin [Ref. 38] and 
Bloomfield and Steiger [Ref. 39]. Although these estimators are 
consistent and asymptotically unbiased in linear models, for the random 
coefficient models the results of the simulation study show that they 
have a bias that does not go to zero asymptotically. 


The maximum likelihood estimator of a, is found using an 


MLE’ 
iterative technique with the initial estimate being derived from the 


least squares estimate of serial correlation, YA 

Many of the simulations comparing the different estimators are 
conducted within the framework of SIMTBED (Ref. 15J. From the Summary 
Statistics table generated by SIMTBED for each estimator, it is possible 
to draw conclusions concerning the bias, the variance at different 
subsample sizes, the asymptotic variance, and how fast the estimator 


approaches asymptotic Normality. In the SIMIBED program, one can 


Specify the total number of samples examined at each subsample size. 


SZ 


The total number of samples used is the product of three parameters, N, 
M, and NSR. Three combinations of these parameters were used. Table 
III.E.1.1 is a summary of the number and types of subsample sizes, N, 
and the number of independent repetitions, M, of each type of simulation 
conducted using SIMTBED. 

TADEESSSEPPSES 17 


Summary of SIMTBED Types 


Number of 
SUTER 
Subsample Sizes (N) M SC ions 
Type 25 50 T5 100 125 AI? OO (NSR) 
I 2000 1000 660 500 400 280 200 100 5 
al: 4000 2000 1330 1000 800 570 400 200 10 
EE 8000 HOOO 2660 2000 1600 1140 800 400 10 


Each entry in a Summary Statistios table, which is the output of 
SIMTBED after super replication, is a pair corresponding to a mean 
(average over the number of super replications, 5 or 10) and an 
estimated standard deviation of that mean value. From Table III.E.1.1, 
it is clear that a large number of independent realizations was used in 
the computation for each super replication and the different subsample 
sizes. Because of this, subsequent tests of hypothesis that we use on 
the simulation outputs will be t-tests on the mean of a random sample of 
size 5 or 10 drawn from a Normal population where o? is unknown, but is 
estimated from the sample. 

Before describing each estimator and simulation experiment, it 
is convenient now to summarize the conclusions of this investigation 


into the estimation of parameters in the BELAR(1) process: 


153 


The simulation results from SIMTBED indicate that both the sample 
median and sample mean are asymptotically Normal estimators of y. 
The asymptotio variance of the sample mean is approximately twice 
that of the sample median across all values of the correlation 
coefficient, Y. 

The simulation results from SIMTBED also indicate that the mean 
absolute deviation, given in (II.E.3.2), is an unbiased and 
asymptotically Normal estimator of the scale parameter, i. It 
also has the smallest asymptotic variance of the three estimators 
considered. 

The least squares estimator of Y, the lag-1 serial correlation is 
asymptotically unbiased and Normally distributed. Simulation 
results support this conclusion. 

Simulation of other estimators of lag-1 serial correlation based 


on non-linear residuals of the form iuo. COMM NE 


+ Bf (X >X ) 


= n-1 


indicates that the value of (Y,B) that maximizes the sum of 


squares of R, is approximately (Y Oe 


LS’ 
Robust estimators of serial correlation based on certain symmetric 
loss functions of the linear residual (other than the sum of 
squares) are biased and, apparently, asymptotically biased. 
SIMTBED outputs of the Huber(c), rank and LAD estimators of lag-1 
serial correlation clearly exhibited this result. 

The maximum likelihood estimator of Y, the lag-1 serial 
correlation was computed by the iteration scheme given in Section 


III.E.6 for simulated data from the BELAR(1) process. Results of 


the simulation appear to indicate that the estimator is converging 
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to a Normal distribution with a mean value equal to the true Y. 
In eomparison to the least squares estimator, the simulation 
results indicate that the maximum likelihood estimator has a 

smaller variance and bias at all values of Y. 

Ee estimators of Location 
a. Introduction 

The sample median, m, and the sample mean, X, are two 
commonly used estimators of the location parameter, y, in a stationary 
process with a symmetrio marginal distribution. The sample median is a 
particularly attractive alernative to X when the symmetric distribution 
is also thick-tailed. (It is well known that for i.i.d. processes with 
a double exponential marginal distribution that the sample median is the 


maximum likelihood estimator of u). 


For i.i.d. processes, it is well known (Dudewiez, [Ref. 40: 
p. 221]) that X has an asymptotically Normal distribution, NCO, Vo2/n). 


Likewise, m is asymptotically Normal, NÍO, Y 1/linf v (x ell, The results 


for the sample median hold provided fo (x 5? is continuous in a 


neighborhood around x is positive, and is bounded above. 
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The problem of estimating u from dependent data is more 
difficult. Analytical results exist about the limiting distribution for 
X in ergodic processes and for the sample median for processes 
satisfying certain mixing conditions. (Mixing processes are those for 


which random variables "sufficiently far apart" are approximately 


independent). 
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Since the BELAR(1) process is an RCA(1) process with i.i.d. 
innovation and random coefficient processes, x d, is ergodic (Nicholls 
and Quinn [Ref. 16: p. 37]). Therefore X is still an unbiased 
asymptotically Normal estimator of y, but the variance is modified by 


the factor 


1+2) O lee 24) 


See, for example, Priestly [Ref. 33: p. 343]. 

The problem of estimating the median has been studied for 
cases where the data are qom MM From Heidelberger and Lewis 
(Rer, 41)], we have that the usual order statistic point estimate 
(sample median) is still valid, but the variance is modified by a 


factor, p(x _). Here p(x 5) is the initial point on the spectrum of the 


5 


binary process (I (x )), where 


2 


le af X. eX 
Ix) = (CL P; 2 2) 
O otherwise. 


That is 


n 
m n Var ( ) I (x c)/n). Qut mbi e 
i=1 i 


As was already pointed out, conditions for convergence and 


Central Limit Theorems for the sample median depend on mixing 
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conditions. There are several kinds of mixing conditions. It is not 
known, however, if the BELAR(1) process satisfies any of them. 

However, the LAR(1) process does satisfy the mixing 
conditions of Gastwirth and Rubin [Ref. 15]. Thus, for the LAR(1) 
process, it is known that the sample median has an asymptotic Normal 


distribution with mean zero, and variance given by 


t o 
) ea lesen yiki) + sinh(x ylklyy a | 
k=-0w . 5 SC 





1+Y 
LOL EE o h) 


Gastwirth and Rubin [Ref. 15] showed that for the LAR(1) 
process, the asymptotic variance of X is taco that of the sample median 
across all values of serial correlation. 

The question here is, what are the properties of the sample 
median in estimating u from data of the BELAR(1) process? Also, how does 
the sample median compare to X in the BELAR(1) process? 

Since {x} from both the BELAR(1) and the LAR(1) processes 
have a marginal Laplace distribution and first-order autoregressive 
correlation structure, the hypothesis is that the sample median from the 
BELAR(1) process behaves similarily to that generated from data in the 
LAR(1) process. Also, the relative efficiency of m to X is the same in 
the two processes. 

To substantiate this assumption, the sample median and 
sample mean were compared in Simulation experiments in SIMTBED for data 
generated from the BELAR(1) process. The simulation output is compared 


to the theoretical results for the LAR(1) process. 


Vb 


b. Simulation Results 

For a = .1 and a corresponding correlation EE of 
Y = .17664, the estimators X and m were simulated in SIMTBED using a 
size of Type III from Table III.E.1.1. The results are given in the 
Summary Statistics in Table III.E.2.1. Looking at Table III.E.2.1 for 
N = 100 and greater, there is no evidence of non-Normality from the 
first four estimated moments of the sample mean. The leading 
coefficient in the asymptotic expansions for E(X) and Var(X) do not 
deviate significantly from the theoretical values, i.e. X is unbiased 
and Var(X) = 2.8581/N. 

Looking at Table I11.E.2.2, the Summary Statistics at'a = .1 
for m, it appears that even for N = 25, m is unbiased and the sample 
skewness is fluctuating about zero. The EE however, at each 
subsample size up to N = 250 deviates significantly from a hypothetical 
asymptotic variance of 1.4291/N, the corresponding result for LAR(1). 
This is explained by the kurtosis of the estimate m of the median which, 
although decreasing with inereased subsample size, is still 
significantly different from O until N = 250. The leading coefficients 
in the expansions for the expectation and for the variance are not 
significantly different from O and 1.5291 respectively. Since the data 
are only slightly correlated, we could have expected the sample median 
to behave similarily to that of the case of the completely random 
process with Laplace marginals, i.e. m is unbiased, asymptotically 


Normal, and has a variance with leading coefficient 1/n. 
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For values of a = .5 and .844, with corresponding Y = .63662 
and .89986, using Type II experiments as described in Table III.E.1.1, 
we again compared the behavior of X and m. 

From Tables III.E.2.3 and III.E.2.4, we see that the 
behavior of X is as expected. The sample mean appears to be unbiased. 
For N 2 250, there is no evidence of non-Normality. The estimates of 
the leading coefficient in the asymptotic expansions for the variance 
agree within one standard deviation of the postulated values of 9.0 and 
SCH 

The corresponding results for m are given in Tables 
III.E.2.5 and III.E.2.6. The sample median shows no bias and appears to 
be asymptotically Normal after N 2 250. In each case (a = .5 and 
a = .844) the leading coefficient in the expansion for the variance is 
smaller than the corresponding value for the variance of the sample 
median in the LAR(1) process, i.e. 4.5 and 19 respectively. 

The analysis thus far has indicated that at least for data 
with non-negative correlation in the BELAR(1) process, there is little 
evidence to suggest that the behavior of the sample median is 
significantly different than in the LAR(1) process. From Table 
III.E.2.7, we see the same kind of results that Gastwirth and Rubin 


[Ref. 14] reported. As sample size increases, the efficiency of X 


relative to m drops to 50%. 
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TABCEST ILIE: 2S 


Efficiency of X Relative tom in BELAR(1) for Y > 0 


1 


N Y = +.1766 ll ADOOS Y9 
25 . 6! 269 . 98 
50 50 EG .81 
15 55 255 SE 
100 . 51 252 O 
125 52 . 50 162 
(ES 55 . 49 D 
259 .51 47 : 59 
500 . 50 . 47 . 48 


1. For Y = +.1766 the results are based on a Type III experiment. 
For the other two cases, the results are based on Type II 
experiments. 

We also simulated X and m for negatively correlated data 

from the BELAR(1) process. Type III simulations were used for X and m 

at Y = -.63662 and a Type II simulation for X at Y = -.9. From the 

Summary Statistics for x in Tables MIE. 203 and III.E.2.9, we see NBS 
unbiased and approximately Normal for sample sizes greater than 125. 

Estimates for the coefficients for the asymptotic variance are not 
significantly different from the theoretical values of .4441 and .1053. 

From Table III.E.2.10, the most obvious point to be made is that 

even for moderately negatively correlated data, m is not Normally 

distributed even for subsamples of size 500. The sample median is 

unbiased, but the kurtosis is not decreasing fast enough. The variance 

of the sample median even at N = 500 is almost certainly not 


(1/N)(1*Y/1-Y). However, the leading coefficient in the expansion for 
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the asymptotic variance is within a standard deviation of the 
hypothetical values (1/N)(1*Y/1-Y). This would indicate, for the case 
of negative correlation, a much slower convergence of the sample median 
to Normality than for positively correlated data. 

For negatively correlated data from the BELAR(1) process, we 
have observed that X does not lose efficiency relative to m as fast as 
for non-negatively correlated data. In fact, from Tables 111.E.2.8 and 
III.E.2.10, it is clear that the variance of X is smaller than m for 
subsample size N S 100. 

3. Estimators of Scale 
mH Introduction 
In the case of estimating the scale parameter, A, we 


considered three estimators. Since Var (X, ) - 21^, we considered 


^ 


À. = S/Y 2 where 


N 
) (X = Oke (IIIe. 


Since the maximum likelihood estimator of A for an i.i.d. sample with 
marginal Laplace distribution is the sample mean absolute deviation 


about the median, we set 


N 
Rr (IILE Sa) 
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As a third alternative, we chose the scaled median absolute devíation 


about the median, 


IX. - m| 
- ER ; | ATI NEN 


3 i .69315 


The scaled median absolute deviation is a frequently used robust 
estimator of scale [Ref. 38]. In the simulations, we assumed that X 
are Laplace with median = mean = O for all n. Table III.E.3.1 contains 
a summary of the type simulation (as defined in Table III.E.1.1), the 
estimator (A; sAn) and tie values of a and Y that were used. 


TABLE MEI. E. 37I] 


Summary of Simulation Schedule for Estimators of À 


Y -.89986 .17664 .63662 

a . 811 .1 5 
Estimator 

i. Type II Type III Type I 

S Type II Type III Type I 

i. Type II Type III Type I 


b. Simulation Results 
In the Type III simulation (See Tables III.E.3.2 - 
III.E.3.4), using slightly correlated (Y = .17664) realizations of the 


BELAR(1) process, we found the best estimator of à to be A the sample 


2 
mean absolute deviation. It appears to be unbiased for all subsample 


sizes. The skewness and kurtosis are decreasing with increased sample 
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Sizes. ¡But even for M =< 500, the skewness is still significantiy 
different than 0. Using two-sided t-tests with 18 degrees of freedom 
for the equality of means of two Normal populations witn unknown 
variances at the 90% confidence level, we reject each of the hypotheses 


independently that: (1) Var (1,) = Var (à) amd Kt?) Var(A,) = Var (1,). 
The mean relative asymptotic efficiency of A and Ae to d re estimated 


from the regression on variance coefficients to be 76% for A, and 60% 


^ 


COPA... 
3 
Botn A, and E appear from the simulation to be biased. 


From the second coefficient in the mean of regression on average in 


"Re IIF.WÇW.3.2, À. appears to have a negative bias of order(1/N). From 


^ 


Ple 111.£.3.1% Tt appears that A. nes e positive bias of order (1/N). 


3 
However, since tne leading term in the expansion of the mean for both 
estimators is the Crue value of Y, it appears that both |. and $ are 
asymptotically unbiased. 

When we considered moderately to highly correlated data (see 
Es IERCES3.5 - III.E.3.10); Wie diTerences in tae behavior of tre 
estimatons were not as easy to discern. The particular bias of ). and 
fy was even more apparent, especially at the smaller subsample sizes. 


As |Y| increased, so did the expressions for the asymptotic variances. 


mo cech of tle subsample sizes, in boči types of correlation, A, had the 


E 
highest estimated variance. The variance of A was significantdy 
@it@erent then chat of A1, at all lewels of significance anc subsample 


2 


sizes up to Y 500. However, we could not reject that the asymptotic 


A. and A. were the same at each of the two levels of 


variances of A 
pi o um 3 


egreceletilon. 
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4. Least Squares Estimation of Serial Correlation 
In this section, it is assumed, unless otherwise stated, that X 
has a standard Laplace (u 2 0,1 = 1) distribution. If not, standardize 


X by 


x mue oe (LIL. ES 


^ ^ 


where y and A will be specified from those estimators already discussed 


in III,.E.2 and? Aa 


^ 


The least squares estimator of the lag-1 serial corrlation, Yis? 


is derived. First, we show that the BELAR(1) process is an RCA(1) 
process of Nicholls and Quinn [Ref. 16]. Then, we define the linearized 
residual in the BELAR(1) process and state some of its properties. From 


these properties and some results from Nicholls and Quinn for RCA 
processes, we derive the asymptotic properties of Y 


A 


Of Yis are observed also in the simulation results for selected values 


LS' The properties 


of Y. Finally, the joint least squares estimator of location and serial 


correlation are derived for the BELAR(1) process. 


Rewriting (III.D.1.1) by adding and subtracting YA -1° we have 
A A E (III.E.4. 2) 
n ns n i SEN n^ FN 
72 ; 
where Y = ELA, (a, 1a)) as Given by li lL. G22. 3 eer 
l = 1: (AT, (a, 179) - ¥} is an i.i.d. process stochastically independent 


ote p nas nin (e). The variance of the random coefficient is (a - Y?) 


132 


(l) As can be seen fon (III:.C.2.5) and the fact that O € a < 1, 
if we know a, then we also know |Y| and vice-versa. That is, in the 
BELAR(1) process, there is only one independent parameter to estimate 
for the correlation. Now, we recognize (III.E.4.2) immediately as an 
RCA(1) process of Nicholls and Quinn [Ref. 16]. Since (e, ] and 


1 
(A, 


i i - Y} are each identically distributed as well as being 
serially independent and independent of each other, we have by theorem 
2.7 [Ref. 16] that {xX} is the unique strictly stationary and ergodic 
Sorucion to (III.E.1Nn.2). 


There are two ways to look at the linearized residual in the 


BELAR(1) process just as described in Chapter II for the NLAR(1) model: 


1⁄2 
R. = (A. (a,1-a) - Y]X, | * €, CTI es 
or 


RIE. aN e (ITE U74) 


From (III.E.4.4), we see that since {xX} woe stricely Stationary, SOFIS 
(R ). Also, we see E(R ) = 0 and Var (R ) - 2(1-Y?). Lawrance and Lewis 
[Ref. 22] proved that the R. are uncorrelated, but in general, not 
independent. From (III.E.4.3), we note that for any n, i En unless 
a = 0. ¿Except for when a = 0 or 1, Var (R? > Var (e). As a increases 
from zero to one, both Var (R? and Var (e) decrease monotonically from 
Aoo zero. This is evident fron the definitionsof Y in (III.C.@.5) 


with %£ = 1. 
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Two other properties of (R ) are obtained from (ill. hk. 4s5) oy 
conditioning on the independent, identically distributed processes le, 
and (A (a, 17a) - Y) up to time k n - 1l. We have 


172 
ECR |fe A (a,17aà) = Y); k - 1,2,...,n-1] 


1⁄2 
- X HA, en te 2) Elen?) - 0, GI DI. p. ey 
because (AT ^ (a, 1-2) ERIGI e, are independent of the process through 


time n-1 and X is a function only of the process through n-1. 


> 


1/2 
E[Ri| fe,» A, (asia) - YI; k = 1,2,..-,n-1] 


= Ele?) *« x? ,EL(A (a, la) - 132) 


EE OO, (1116: 450) 


which is only a function of a or Y? alone, since a determines Y? and 
vice-versa. 


Now using (III.E.4.4) anda given realization of (x3 of size n, 


n 
we minimize ) R? with respect to Y to obtain the conditional least 
122 


squares estimate for Y. This is the same procedure as described for the 


NLAR(1) process. We have 


zu ; f er (J x .). (ED E ID 
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Two problems can occur using (III.E.4.7), especially for small 
sample sizes. For the BELAR(1) process defined by (III.E.4.2), 


Ie 2 0, and yet it is possible that Y,. < 0 or |Y ES If 


LS 


ZI Yrs < 0, we would estimate that the sample {xX} came from the 


BELAR(1) process with the negative sign on A (a, 1-2). If va > 1, we 


Ls! 


would estimate Y by *1 or -1. 
In order to obtain the "least squares" estimate for a, we solve 


numerically for Die in 


al E Ves MER , (IIT E. 5) 


Iu si ven Y o from (5.7) if |Y ¿| < 1. 


S 


The estimator Y S given bu IIT d4.T) has"the fotlowing 


L 


properties which we state as a corollary to Theorm 3.1 [Ref. 16]: 


CORROLLARY. For Ux] given by (III.E.4.2); (R .) POLI. EAS.) 


and (III.E.4.4), the least squares estimator Y has the following 


LS 


properties: 


^ 


ID» Since E(X*) = 24 <a quo ENT -Y) has a distribution 


LS 
which converges to the Normal with a mean of zero and a variance o. 


given by 
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02 = 1*5a-6Y?. REES EH 


The proof follows from Theorem(3.1). The strict stationarity 
and ergodic nature of (X ) leads to the almost sure convergence. The 
results of (III.E.4.5) and (III.E.4.6), together with Billingsley's 
Martingale Central Limit Theorem provide the results for the asymptotic 


Normality of Yis’ 


A strongly consistent estimator for the variance, y» is also 


given in [Ref. 16] for the general RCA(1) process. For oy in 


(III.E.4.9), this estimate becomes 


n 
^ 4 
_ (1-6,,) n (à, nY?) ) i-1 
G2 = cem LS ) X2 + SE Ee E (TI P N 16h 
1 n n ial i n 
LX L Xia 
i=2 | i=2 | 
For large n (III.E.4.10) is approximated by 
n 
(n-1) (a. .-Y2. ) 2 i-1 
LS IUS qmm 
e E EE TA e ee 
Uc (1 dr q) + = : Cake 
E ° 
i=2 MA 


^ 


where Y ç is from (TILL Eed Grs (CITIES es 
Simulations of the least squares estimator of Y were conducted 


for selected values of Y in SIMTBED using Type 111 plans. The results 


are summarized in Tables III B.lH 7 ] IAEA The 
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results reflect the theoretical behavior of the estimator as derived 
above. 
We note that the joint conditional least squares estimators of u 


and Y in the BELAR(1) process are the same as in the linear AR(1) 


n 
processes. Minimizing the sum D R? where now 
1=2 
R, = (X,71) < VA u), (TET ES. 12) 


leads to the following joint estimators for u and Y 


n ^ n ^ 

ü x. Ae, (III 
; T D" 
122 122 

n n n 

YN (Xy X F<) /"W mm xs) 2” ñ (mipi Sr. 19 
r 0 Te) : Wel 
]=2 122 


For large n these eguations reduce to the familiar ones 


f =X (li tee. 4. 1) 


^ n n 
3 Bä = 2 
E Ln KO EC Le X)?. (III.E.4.16) 


We now turn in the next section to the question of alternative 


estimators for Y given that u = O and À = 1. 
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>. Other Estimators of the Lag-1 Serial Correlation 
a. Estimators Based on a Non-linear Residual 

In this section, we explore other possibilities for 
estimating Y in the BELAR(1) process. There is a question as to why one 
should use the linear residual since the BELAR(1) process is a random 
coefficient process which is non-linear. Secondly, why should you 
minimize the square of the linear residual as opposed to minimizing some 
other symmetric loss function which is a function of the linear 
residual? The answer to both questions is that the least squares 
estimator of Y based on the linear residual out-performed other 
estimators in the simulation experiment. 


Consider the following types of non-linear residuals 


2 
X -YX DUX. —2)9 @se 


-U 
II 


p wu = = 2 ; 
RA X, Ta BX n-191gn(X, ,). (ILII E52) 
x 
EN (III.E.5.1), it follows that u has zero mean and 


eebe (TITS) 


x 
,R ) = e. (LIL SERIE 


Cov(R ,R, , 


3 * 3 x 


x 
Introducing the extra parameter, B, makes the residuals, Ra? correlated 
unless a = 0 or B = 0. If B is zero, then we again have the usual 
Amearized residual in (IIIZ2ESN2U). If @29= 7/10, then the variance is 


a constant, but the residuals are still correlated. It is easy to 
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compute the least squares estimators for Y and 8 from (III.E.5.1) and 
(III.E.5.2). We simulated the estimators of Y and B and compared them 
to the results based on (111.E.14:.4) with R 40 eon Tabl ecm linge. 5.1, 
we see that the different estimators of Y from all three residuals are 
close to the true Y. The result is thacthe estimates of R are very 
GE E Ee 

To see how much the value of Y could change with B fixed at 


some non-zero values, we simulated the least squares estimator of Y with 
B = 0 and the estimator of Y based on (III.E.5.1) with B = YV/ 10 and 


again with B = -Y/Y 10. From Table III.E.5.2, we see that 8 £ 0 

severely alters the estimate of the serial correlation. Therefore, in 

the remainder of this subsection, we consider alternative estimators for 

Y in the BELAR(1) process to be only those based on the linear residual. 
b. Estimators Based on the Linear Residual, R. 

Besides the asymptotically unbiased least squares estimator, 
we considered the following well-known estimators of Y in linear AR(1) 
models: 

1) The Huber(c) function as described by Denby and Martin [Ref. 38]. 
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The estimator, Y is the value of Y that satisfies the 


H” 


equation 


idi S = (III.E.5.5) 
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DATA 


KI 
X2 
x3 
X4 
X5 
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STD 
BIAS 


DATA 
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STD 
BIAS 


DATA 
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= 500 
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56891 
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-. 039089 


LS 
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= 1500 
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TABTE ITI. 3.531 
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.61630 
.62601 
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200050 
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molor] 
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.00815 
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.00560 
. 00236 
20120 
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. 00797 
. 00040 
. 00160 
.00728 
. 00033 
.00629 
.00033 
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Simulation Results for Various Definitions of RA in BELAR(1) 


388562 
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B ' 


.01771 
.01637 
.05808 
.07208 
. 04748 
.03580 
02255 
.03580 


63662 


4- 


. 00013 
03095 
.01093 
. 02359 
.00581 
. 01428 
ONES 
.01 428 


.83463 


.01821 
USA 
.00641 
. 00193 
. 01041 
. 00598 
.01041 


TABLE TITI ES 
Simulation Results for Various Definitions 
of pa to Estimate Y Given B in BELAR(1) 





N = 500; Qu. 5 Y = .63662 
Sé ^x* ^x = 
DATA (Y, ¿18 = 0) (le = —) (y |e = ——) 
10 y 10 
1 . 56891 21552 .27410 
2 .61996 .21515 . 26257 
3 . 62695 . 38621 . 38450 
4 .57995 34356 . 39730 
5 .59236 . 36082 . 40557 
where 
t ir els Tes 
w (t) = (III.E.5.6) 


GucuEn(t) iN NE 


The corresponding weight function "yu (C) is Pp (t)/t and c is 


^ 


a tuning constant. As c goes to infinity Y(t) approaches t and Yu ls 
the least squares estimator of Y. If c = 0, we have the solution of 
(III.E.5.5) is the median of X,/X,_.. 

For c other than O or +=, there is no closed-form solution to 
(III.E.5.5). We obtain the Huber(c) estimator of Y by iterating the 


following scneme: 


es = ——— =s ce SEINE. 5. 70 
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^ 


where Y, is the least squares estimator of Y and 


median |X,| 
i 


D^ CONSENT OU ° ONDES 5.8) 


tfomene scaling constant for the R,. If Y = @, then S, is the median 


i 


absolute deviation estimator of the scale parameter in the Laplace 


zer b5utjon as givenwin Section III.E.3. Typical values of c are 1, 


1.5, 2. We use for illustration c = 1 in the simulation along with fis? 


^ 


the least squares estimabe, and Y,, the median (X. 7X. Js 


M' 1 


2) The Least Absolute Deviation (LAD) estimator of Y is the minimizer 


of 


(III.E.5.9) 


caso 


L. 
ju 


2 1 


i 


^ 


The solution 1s, Yum’ 


the weighted median of X. /Xi where 


the weights are Xi] fon 02, ns 

Denby and Martin [Ref. 38] reported that the Huber(c) 
estimates are consistent and asymptotically unbiased for linear AR(1) 
models. Bloomfield and Steiger [Ref. 39] showed that the LAD estimator 
is strongly consistent and asymptotically unbiased for linear AR(1) 
models. In Figures II1I.E.5.] —- III.E.5.4 are examples from SIMTBED of 
the behavior of these estimators in simulated data from LAR(1), a linear 
AR(1) model with Laplacian marginals and AR(1) correlation structure 
given in Chapter II. These results appear to be consistent with the 


results reported above for linear AR(1) processes. The leading 


coefficient in the expansion for the mean of each estimator does not 
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differ significantly from the true value, 0.63662. We also see that the 
median (X, /X; 4) and the weighted median (Xi /X; i) estimators are 
considerably more efficient than either the Huber(c) estimator in Figure 
III.E.5.3 or the least squares estimator (c = ©) in Figure III.E.5.H. 
Since the least squares estimator remains asymptotically 
unbiased for the BELAR(1) process as was shown in Section III.E.4, it 
was of interest to observe how the Huber(c) estimators, e < o, and the 
LAD estimator of Y would behave. Considering the ordering suggested by 
the simulation results in the LAR(1) process, it would seem possible 
that the Huber(c) estimates could be better than the least squares 
estimator ol r: In the boxplot analyses in FPigures IILE 5.5 
III.E.5.8 are the results of the simulation for Y = .63662, but for 
data from the BELAR(1) process. The boxplots in Figure III.E.5.5 
display the theoretical behavior of the least squares estimator of Y. 
The other estimators of Y appear to be converging to other values 
Yo + yY. To see this, note the first entry in the coefficients for the 
asymptotic expansion of the mean of Y in FiguresMIII.E:.5.6 -~ TPESB SOS 
In each case Y, > Y. AlSo from the estimate of "the Standardi den MECOS 


0 


we assert that Yo is significantly larger than Y for each of the 
alternative estimators investigated here, because the difference, 


| Y - Y |, is larger than four standard deviations. 


ol 

For the BELAR(1) process, we observe a reversal from the 
LAR(1) process in preference for the estimator of Y. We will use the 
least squares estimator as the initial estimator of Y in the iterative 


procedure for finding the maximum likelihood estimator of Y which we 


develop next. 
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6. Maximum Likelihood Estimation of Y 
a. Introduction 


In this section, we develop the maximum likelihood estimator 


of the lag-1 serial correlation in the BELAR(1) process, YULE" We use 


the expression for the logarithm of the likelihood function, L(a), in 


(III.D.2.12) in an iterative procedure to find the values of a and the 


sign of Al ^ (a, 1-a), that minimizes -L(a); call the pair Lä rs sign). 
Since knowing a and the sign of A! (a, 1-a) uniquely defines Y, YMLE can 


be found from (III.E.4.8) using (4 sign). 


MLE’ 

We consider only the univariate problem. That is, we have 
assumed that (X ) is marginally Laplace distributed or have determined 
from Q-Q plots that the best £-Laplace fit to the data is when £ = 1. 
Secondly, we assumed that s is standard Laplace (yp = 0; A = 1) or 
that (X ) has been standardized using a pair of estimators (fh, 1) from 
zccwEons TIL;E 2. and (IT TE. 3. 

As a function of a, (III.D.2.12) is very complicated. There 
is little hope of being able to analytically solve for the critical 
values of a. In fact, the evaluation of a derivative of (III.D.2.12) is 
at least as expensive computationally as the function values themselves, 
Since (III.D.2.12) contains exponential functions of a. However, since 
this is a one-dimensional optimization problem, there are IMSL routines 
that will perform the search without using deviatives--Golden Section 
search; bisection method; or interpolation routines. 

We chose the IMSL routine ZXLSF which performs a one- 


dimensional search for a minimum of a smooth function ina closed 


interval using quadratic interpolation. The FORTRAN routine which 
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evaluates (III.D.2.12) is formulated sov thateZXESBUsUSscaccDNUPUCDN 
interval (-1,1) where a < O implies that conditional densities of the 
form (III.D.2.10) are being evaluated instead of those given by 
(III.D.2.9) when a > O. The initial value for a to start the iteration 
) 


procedure.of ZXLSF is a four-digit approximation (a sign 


DS LS 


^ 


corresponding to the least squares estimate of serial correlation, Yt» 
obtained from (III.E.4.8). 

The queston of accuracy in the calculation of (III.D.2.12) 
is especially important because the likelihood surface is extremely flat 
in many cases. We want some assurance that ZXLSF is efficiently 
searching for the optimum and not "chasing roundoff errors". This 
happened before we increased the accuracy parameter in DCADRE and used 
double precision. In order to assess the accuracy of our calculations, 
we constructed first- and second-divided differences for values of a and 
(III.D.2.12). The divided differences are approximations for the 
derivatives. For those simulations that we checked, there was one 
transition of the slope through zero at the critical point found by 
ZXSLF. The second-divided differences at all points in the vicinity of 
the critical value were positive indicating the general convex upward 
shape of (III.D.2.12). Sometimes there was some fluctuation in values 
of the second-divided differences, but no change of signs near the 
reported optimum. 

The fluetuating values of the second-divided difference 
indicated some noise in the calculations. This occurred in two places. 
If the second-divided difference covered points on both sides of 


a = 1/2, then there was often a jump in the value of the second-divided 
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difference. This occurred because of the change in the method of 
calculating the conditional density when a changed from a < .5 to 
a 2 .D. Other times, slight aberrations in the observed pattern of the 
secoma- divided differences occurred for values of w that were small, 
O< a< .15. This is attributed to the fact that DCADRE evaluations for 
the table of values of the (1-a)-Laplace density (0 < a < .15) in many 
subintervals was not behaving regularly. The computed value was 
accepted because the estimated error was small, relative to the accuracy 
requirements. The important consideration, however, was that no error 
in calculating (III.D.2.12) should be so large as to falsely indicate a 
change in convexity in the vicinity of an extremum, so that ZXLSF would 
be ineffective at locating it. 

The selection of a good starting point in this procedure is 
also important. It is desirable to commence the iteration in ZXLSF as 
close to the global optimum as possible in order to reduce the 
possibility of converging to a local optimium. Note, also, that as a 
function of a, the conditional density 1s not necessarily convex and 
often is not even unimodal across the range from Y = +1 to Y = -1. 

Since (111.D.2.12) is the logarithm of the product of such 
functions, there is no assurance that (111.D.2.12) has a single relative 
maximum especially for small sample sizes. When the sample size is 
small, it is advisable to pick a starting value for the iteration on 
both sides of a = 0. Select the maximum likelihood estimator to be the 
one with the higher value of L(a) if the routine produces two different 


^ 


a's, corresponding to the pairs (à, +) and (&5,-). 
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^ 


Since we know that Vë is aceonsistent, aəymptopioally 


unbiased and asymptotically Normally distributed estimator for Y, we 


chose the value of a and model corresponding to Y as our initial guess 


ES 
lIncZXLSEk. 


b. Simulation Results 
The maximum likelihood routine for estimating Y was tested 
in simulations using computer generated data from the BELAR(1) process 
with known parameter values of £, u, A and a. By performing M 
independent simulations of sample size N (where N is increased for each 
set of M simulations) and fixed a, we were able to compare the standard 


deviation and bias (if any), of Y to that of the initial least 


MLE 


^ 


squares estimator Y for which the asymptotic distribution is Normal. 


ESA 


Changes in the Normal plots for one set of M simulations for N small to 
a second set of M simulations for a larger N would give some indication 


of how fast YMLE is or is not converging to a Normal distribution. 


Both M and N were small in the simulations for two reasons. 


Since the asymptotic distribution of Y was known, it was of more 


LS 
interest to see how much better YMLE was for the smaller samples (i.e., 
was the bias smaller for YMLE or was it, in fact, unbiased). Secondly, 


the run times for calculating (III.D.2.12) for N < 200 was long. The 
evaluation per sample of size N = 25 ranged em 100-300 Secs. om 
N = 175, the run times ranged from 700-950 secs. 

Figures III.E.6.1, IIILE.6.2 and IIT E. 6: are the Norma 
plots of twenty realizations of the maximum likelihood estimator of 
serial correlation and the least squares estimator of serial correlation 


for simulated data from the BELAR(1) process for selected values of a 
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and for two subsample sizes, SSN. The layout provides for two-way 


comparisons. That is, YMLE from smaller SSN can be compared to YMLE for 


^ ^ 


larger SSN. Likewise, for a given SSN, Y can be compared to Y 


MLE LS” 


which is known to have an asymptotic Normal di stri bution seeThemsitraignt 
line in the Normal plots corresponds to a Normal distribution. The 
curved lines correspond to the Kolmogorov-Smirnoff bounds calculated 
from the data at the 95% confidence level by the routine in the IBM 
experimental APL routine called GRAFSTAT. 

It appears from these figures that for the larger values of 


SSN, YMLE and Yis fit Normal distributions better than the corresponding 


samples from the smaller values of SSN. It also appears that YMLE fits 


A 


a Normal distribution as well as the Yis for the larger values of SSN. 


This supports the notion that the maximum likelihood estimator is 
converging to a Normal distribution. 

Figüres III.E.6. 4, WIII.E.6 2 and I1. EO O ae eee 
corresponding scatter plot analyses for the data in the previous figures 


for the larger value of SSN. It appears that YMLE and Tue have a 


positive correlation coefficient which becomes more pronounced as the 


data becomes less correlated. The distribution of MLE also appears to 


^ 


have a smaller variance than Tuer This effect is more pronounced for 


more highly correlated data. Compare, for example, the estimated 


Standard deviation of YMLE and that of Yt from the table in Figure 
III.E.6.4 with the corresponding entries in the table from Figure 


IIJ op o O 
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F. &-LAPLACE MOVING AVERAGE MODELS 
lmuroduccron 
In this section, we derive a time series model that has an 
&-Laplace marginal distribution and the correlation structure of a 
linear q order moving average model. This construction uses the 
square root Beta-Laplace transform given in Section III.B.3. The first- 
order model retains the full range of correlations up to 1/2. 
2. The First-Order Moving Average Model 
Let (L (2-a) ]} be an i.i.d. sequence of (-a)-Laplace random 
Variables; (AL * (a, &-20)] be an i.i.d. sequence, independent of 
(L (%-o)), where WA is a Beta (a,£1-2a) random variable and 0< a < 2/2. 


Both the innovation and the coefficient sequences are independent of 


X 


A Then the sequence (X, (£j given by 


X. (0 - L (&-o) * Al," (a, &-2a)L, (ma), (ir m) 


has a marginal %-Laplace distribution and an MA(1) correlation structure 
such that O 4 Corr(X, ,X, 4) <a 

To see that X4 (2) has an £-Laplace distribution, first note that 
by the square root Beta-Laplace transform theorem of Section III.B.3, 
the distribution of the product Via, Läit (tal is a-Laplace. 
Then note that X 6%) is the sum of two independent random variables, one 
of which has an (%-a)-Laplace distribution and the other has an a- 


Laplace distribution. So, if ey C0) is the characteristic function of 


X (4), then 
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ó (o) = | | We Hi = leif I (IB. eee 


1+w? 


To see that (X (%)) has the correct correlation structure, first 
note that by the construction of (III.F.2.1), X= is explicitly 
independent of X, for |k| 2 2. Therefore, Corr(X,,X, ,) is zero for 
kip 2 2. 


For k = +1, we have, after some simplification 


al(a+s) Pi toi za 
= ——  — GIMMNE. 2. 3) 


Corr (X, ,X à 
&T(a*1)T(£7a*7) 


Nok 


Finally, note that in the limit as a * O, (III.F.2.3) approaches 


geome, A]so, as oa  $/2, (III.F.2.3) approaches 1/2. 


1⁄2 


» (a, L-24), we have a full 


Replace Al /(a,L-2a) in (4.1) by -A 

range (-1/2,0) of nonpositive lag-1 Serial correlations. 
3. The q-Order Moving Average Model 

The MA(q) model for q 2 2 is the extension of the MA(1) model 
Swenzin C(III.F.2.1). Let (L (%-qo)) be an i.i.d. sequence of (£-qa)- 
Laplace random variables. Let US cM equ fr i ="1,...,0 De 
i.i.d. sequences, independent of each other and of (L_(%-qa)), where 
a Beta a, l=(q+tl)a) random variable for all n and all 


NN Gd. ASO, 0 < a < £/(q*1). Both the innovation and each of 


,X .... Then the 


toe coeficient sequences are independent of X, son 


-1 


sequence (X, (2)) given by 
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q 
X (&) = L (%-qa) + 2 A, qlo 2=(q+1)a)L,_, (2-90), CEUTA F: 3 i 


has a marginal %-Laplace distribution and an MA(q) correlation structure 
for 0 <.a < b/(at1). When : 0, then (x, (1)] is an i.i.d. sequence; 
when a = £/(q*1), then the moving average is an equal weighted average 
of q*1 i.i.d. a-Laplace error terms L (o). 

To see that X CO) is an %£-Laplace random variable, first note 
from the square root Beta-Laplace transformation theorem of Section 
III.B.3, that each product MAC $-(q*1)a]L, .(£-qo) has an a-Laplace 
GENEE 

But the sum of q i¡.i.d. a-Laplace random variables has a qa- 


Laplace distribution. Thus, X CO) is the sum of two independent random 


variables and its characteristic function is obtained as the product 


a q 
p(w) E | 1 L qa 1 | 1 jn 


1*u? 1*u 
IECH 
1 $-qa 1 a 1 2, 


The correlations are truncated at lags |k| 2 q+1. By the 


cons truction orir aen X is explicitly independent of Ee for 


Ik| 2 ati. 
Negative correlations are obtainable with 24 choices for 


UE 


replacing or not replacing [A (a, 2-(q+1)0)] by I , ta, 2=(q+1)at J in 


CIII AS 
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This model can be generalized from the one-parameter case by 


Peplagune da in (LII.F.3.1) with a. in each term in the sum, and 


q 
replacing L (£-qoa) by L ($-q E a). 
n n ini E 
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IV. RESIDUAL ANALYSIS COMPARISON OF THE NLAR(1) 
AND THE BELAR(1) PROCESSES 
A. INTRODUCTION 
Lawrance and Lewis [Ref. 22] developed a higher-order residual 
analysis for non-linear time series with autoregressive correlation 
structures. Specifically, they developed a third-order analysis based 
on the cross-correlation of the linear residual, R ° and its squane at 


lag k, r They applied the analysis to the problem of modelling wind 


Kk 
Speed data. It is important to note that this analysis was done in 
conjunction with, and not in place of, the usual second-order analysis. 
As has been already pointed out, second-order analysis is sufficient for 
modelling only when the processes are both linear and Normal. 

The residual analysis involves only joint moments of order three. 
In Chapter II of this thesis, it was shown that for the NLAR(p) models 
with p = 1,2, all the third-order moments--that is, those of the form 
E(X X X) for all i, j, k--are zero. Therefore, the Lawrance and Lewis 
residual analysis will not differentiate between the NLAR(p) processes 
with the same autocorrelation structure. It can also be shown by 


induction on k that Corr(R_,Ró_ NES Corr (X^ ,R. ) = 0 in the BELAR(1) 


k -k 


process. Hence, either third-order residual analysis will be unable to 
discriminate the BELAR(1) process from any of the NLAR(1) processes with 
the same autocorrelationmr unction 

In the spirit of looking at the lowest possible joint moments for 


differentiating between models with symmetric marginals, a fourth-order 
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analysis is proposed. Two candidates are investigated as the basis of 
this analysis. The first one considered is the cross-correlation of X" 


and the linear residual at lag k, R The second is the 


n-k' 


autocorrelation of R^ and R^. The two analyses are compared by their 


Ke 
abilities to differentiate among the different types of NLAR(1) 
processes and the BELAR(1) process. 

Table IV.A.1. contains a summary of the models in the comparison, 
along with the selected sets of parameter values and corresponding 
correlation coefficient. Even though each of the models has the same 
marginal distribution (standard Laplace) and identical autocorrelation 
functions, each has a theoretical cross-correlation function in terms of 


(X ) and autocorrelation function for (Ro Re that are different. 


ap ei 
The question of how the residual analysis is affected by parameter 
estimation is an important issue, but is not addressed at this time. 
Before the candidates are developed in the next two sections, it is 
convenient now to place both the NLAR(1) and BELAR(1) processes into 
their common RCA(1) framework. 


Using the terminology of Nicholls and Quinn [Ref. 16], both the 


NLAR(1) and the BELAR(1) processes can be written as 


X. s10X * B X t ee CIVA T) 


where le} ische i i.d. innovation, Ele? = 0, and otherwise defined as 


221 


1. (1-a)-Laplace in the BELAR(1) process; 


2. standard Laplace, but with a degenerate component at the origin in 


the LAR(1) process; 


3. scaled Laplace where A = y 1-0, in the TLAR(1) process; 


H. convex mixture of scaled Laplace variables in the general non- 


boundary NLAR(1) process. 


Summary of Models with 


Model 


LAR(1) 


NLAR( 1) 


BELAR(1) 


TLAR(1) 


The (B. ) process is the i.i.d. 


Laplace Marginals and Autocorrelations of Y 


Parameter Values 
= 1; d = .19216 
=] d 225905602 
= 1; B, = ¿89986 
= d = .43836 
OO d “a, 
= ie . 94861 
= .11; positive model 
= .50; negative model 
= .884; positive model 


.19216; 
25366265 
. 89986 ; 


B 
B 
B 


1 
1 
1 


independent of le, and (X .] for k 


defined as: 


D 


TABLE LV A 


Y 
. 19216 
203562 


499985 


.19216 
=. 0 3002 
.89986 


2192215 
OS UNE 
.89986 


219216 
25:552 
.89986 


|x| 


Comments 
Linear models; 
one boundary of 


NLAR(1) family. 


General discrete 
random coefficient 


model. 


General continuous 
random coefficient 


model. 


Other boundar y 
model of NLAR(1). 


random coeffrerent processes 


n-1 with E(B) = 


0 and otherwise 


E Bos 
n 


Ca, a) = Y), Rhere y = EAT ^ (a, 1-)) and A, (a, 1-a) is a 
standard Beta random variable in the BELAR(1) process; 
O in the LAR(1) proces3, Since it is a linear, constant 


coefficient AR(1) process; 


3. 84 (Kr 7a] in the other NLAR(1) processes, where K^ is a Bernoulli 


random variable such that PIK, = 1) =a, and O Š 18,1 s 1 and a 


1 1 


and B, are not both unity. At B} = + 1 the process is the TLAR(1) 
process. 


The c is a constant defined as: 


a 
l 


= Eta (a, 1-0) in the BELAR(1) process; 


B, = BECK?) in all the NLAR(1) processes (a, = 1 being the 


1 1 


LAR(1) prooess). 

The second and fourth moments of E and the second, third and fourth 
moments of Ba are needed in the following sections. In Table IV.A.2, 
there is a convenient summary of the necessary equations. 

Now the linear residual, written in terms from (IV.A.1) has the 


following forms analogous to (III.E.4.3) and (III.E.H.H), 


R = B X t: pe e, (RV A 2) 


R = X =I : (IV.A.3) 


Using (IV.A.2) and the independence of (B) and te}, the second and 


fourth moments of R, are 
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ER) ZER) * E(e?), ( IV.A. hi 


FIn ) 2HECB,) + 12E(B)E(€ 5) + Ele). (IAS) 


The variance of Ro when needed is derived from (IV.A.4) and IV.A.5). 
B. RESIDUAL ANALYSIS USING Corr(X* ,R__) 

In this section, the residual analysis using the theoretical cross- 
correlations, Corr (X^, pn) is developed. Using (IV.A.1) and (IV.A.2), 
we have 


e + 3o2X2 R + 3oX Rf * R^, Chie Bey 
n imd Ici Lat ms 


1 1 n n 


where c is defined in Section IV.A. 


ieee er Oss correlation funetion on. X" and Roy at lag k 1s defined as 


EX RE) 
ee s (IV.B.2) 


(Cm (es Corp (x sh 
31 DN "x3 7R 


=k 
n-k 


where Var (X7) - E(X?) - 6! and Var(R, |) is given by (IV.A.4) for all n 
and all k, since as shown in Section III.E.3, H is stationary 
whenever (X ) is. 

Now from the construction of R, 1 TV 7A72), 1t 1S explicity clear 


that X and R, are dependent for all k and that the (RJ are not 


-k 


independent of each other, unless B. is identically zero as in linean 
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constant coefficient AR(1) processes. However, by the Residual Theorem 
(Lawrance and Lewis [Ref. 22]), the (R ) ape uncorrelated. 


From (IV.B.1), we have for all k 


» 3 3 2 2 
C3, O7 (e*EQU. .R.) * Se*E(XT RR. 


2 
nd m CR GER Le : ) 


=K n n-k 


WA 


A [12/5 (ERAN J. (IAB) 


Consider (IV.B.3) for k < O. Since the random coefficients EI 
are independent of the process SE for) S$ not, "thistimplies Lag 


C44 00 is zero for k < 0. For k=0 in (IV. B3), we nave h sr s yp 


simplificari on, 


T2c?E(B? ) * 6c?E(e? ) + 72cE(B?) + E(R*) 
C44 (0) = —. (IV OB N) 
12/ 5 (E(R2)) 


For k z 1, there is the following recursive formula, 


K 
C (17c*) E( €?) 


C. (k) = C_ (K-1)íc2 + 3eEC(CB? ) + E(B3)] * —————— . (IV.B.5) 
jl 31 2 n 2/ 5 (E(R2)) ^ 


It is now a simple matter to consolidate the expressions for C44 00 


for all k and substitute the appropriate expressions from Table IV.A.2. 


For the NLAR(1) models--including LAR(1), for which a, = 1, and TLAR(1) 


1 


for which B. = +1--we have 
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0, k < O; 


_ 202 3 = E = 20% Ss 2 
(2 a. B: * 6a. 8, (1 2a, ) (1 a) a: B, (8 194, +1205) } 
C44 (K) z x DOS TAO AA k = 0; 
Y 10 e 


(IV.B.6) 
*8* (1-a, 82) (1-a, 82) ^^ 
2,8303, (k=1) y , KY eles 
y 10 
For the BELAR(1) process, we have 
0, k «0; 
LA 2E 2 
D - LEA, k = 0; (IV.B.7) 
3/ 10 (1-Y?) 
| mo 
Z1+20)0 4, (1) ME lie) Cla tow) | SEH 


VENIO 


The theoretical cross correlation functions for each of the models 
and sets of parameters in Table IV.A.1 are given in Figures IV.B.1 - 
IV.B.3. Three points can be made. For the models with D small, such 
as in Figure IV.B.1, there is little difference between the cross- 
correlation functions of all four models. (Of course for p = 0, there 
is absolutely no difference, since all NLAR(1) models and the BELAR(!) 
model collapse into the unique i.i.d. case). A difference between the 
cross-correlation function of the boundary NLAR(1) models--LAR(1) and 


TLAR(1)--does become more apparent as D inoreases. But, there seems 
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to be little distinction between the cross-correlation functions of X 


and Ro from the BELAR(1) process and the non-boundary NLAR(1) process 


EK 


with a,- 8, = YTp] even when |p| is large as in Figure IV.B.3. This 


final point suggests the possibility that there exists a pair of values, 
(o, ,8,), whose product is p Z O, for which the BELAR(1) process and the 
corresponding NLAR(1) process will not only have identical 
autocorrelation functions, but may also have nearly identical cross- 


correlations of XT and R, for some specified number lags 


-k 
HESWO, 13...,J. 


The eross-correlations of X? and R, , can also be used to 


distinguish the random coefficient AR(1) processes with a standard 
Laplace marginal distribution from the Gaussian AR(1) process where 
SE seo, 2) and Sai N(0,2(1-a2)], where a is the correlation 


coefficient. We have for the Gaussian AR(1) models, 


0 Kë est 
NES = Corr? R .) =| (3/5(1-a2)' k = O (IV.B.8) 
31 me m k i 
K 
> 
a C31 (0) k > ] 
Note that is C34 (K) for k 1l is proportional to Corr(X, ,X, |) = aX, 


This 18 consistent with the fact that a Gaussian process is completely 
determined by the mean and covariance matrix. 

Figures IV.B.4 - IV.B.6 show the theoretical cross-correlation 
function of the Gaussian AR(1) model superimposed on the values for the 


different models from Figures IV.B.1 - IV.B.3, which have the standard 
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Laplace marginal distribution. There is some differentiation between 
the Laplace models with AR(1) correlation structure and the given 
Gaussian AR(1) model, but not much. It would, however, be very easy to 
identify the Gaussian model from the Laplace models using probability 
Peovs. this illustrates the point made at the beginning of this 
chapter, that a higher-order residual analysis is not intended to 
replace the existing methods of analysis. It also emphasizes one of the 
very foundations of the thesis, that the marginal distribution is one of 
the very first aspects of a time series that should be examined. 


C. RESIDUAL ANALYSIS USING Corr (RR? _ ) 


k 


In this section, the residual analysis using the theoretical 
autocorrelations, Cor (RR |) is developed. 
Let C ASK) represent Corr(R^,R* |) for all k. Since the correlation 


C 


function is an even function and {R} is stationary, C ASK) E C55 


We represent only k = 0,1,2,... . Using (IV.A.2), we have after some 
simplification for k 2 1, 


s (K) ) 
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ll 


2V 2 2 2 e 2 2 
LEI (BA, * 2B X, 46, * Eos ) E(R )E(R 21 / (0,29, 2 


` naon K 


[E(B )E(X 5 _ ez ) 


| Dk 


EES 7 EE 
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2 2 2 
(E(B7) Cor uch ll 7 (082052 
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- E(B?) Cov(X?,R ) / Var (R5). (INCAS 


2 
nexo 

Now an immediate advantage to the analysis based on (IV.C.1) as 
opposed to that based on Corr (Xñ, Rae) is apparent. For the constant 
coefficient models, LAR(1), Co (k) will have a spike at lag-0 and be 
zero for all other lags, since B. - 0. This will not be the case for 
the other NLAR(1) random coefficient processes or in the BELAR(1) 
process. It will not, however, distinguish the LAR(1) process from any 
linear AR(1) process. This, however, can be achieved using probability 
plots as mentioned previously. 

To derive a computational formula from (IV.C.1.) in terms of the 
parameters of the process, first let E ASK) - E(XTRÀ vi, Then, 


substituting in (1V.A.1) and (IV.A.2), we have, after some 


simplificationmrorsr sss, 


E52 (0) 


2 2 
E((oX 2 + BOX, + cee eee 


4 2 2 2 2 2 
Ele) E 2cE(e7) + EIERE *2Hc E(B’) 


+ 


3 4 
48cE (Bo) + 2HE(B ). (IV.C 2) 


For k 2 1, we have the recursion 


Bae Fe IER - (UE 


Again using the stationarity of {xX} and {Ro}, we have the following 


expression for the autocorrelation unction 
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1 K os 


22 ( ) E( B?) 
n E K- - 2E(R? k ec d 
Var (R^) 20! 1) ( n: ER < l 


For the non-LAR(1) cases of the NLAR(1) process, we substitute from 


Table IV.A.2 and equations (IV.A.4) and (IV.A.5) to obtain 


iov gp osa NB K = 0; 
Elk) = 1% my 
2 = = 2 nma 
aB] E (K 1) + 4(1 aa, Bia a 181), k 2 1] (IV GOD) 
1, [A 
Co, (k) = 


- 2 = - =y? R2 
Senne 2 3 2 202? 


k > 1. (IMs. 5) 


The corresponding results for the BELAR(1) process are 


12(2*aY?-3Y?), k 20; 

E(k) = 
aE, (k-1) + H(1-a)(1-Y7), kw]. CIV OE 
1, k = 0; 

C52 (Kk) = (a-¥?) (Ej, (k-1). - à (1-Y2)] 


av 


; K (IVC) 


MSI 

The theoretical autocorrelation functions for each of the models and 
sets of parameters in Table IV.A.1 are given in Figures 1V.C.1-3. There 
appears to be more discrimination between the TLAR(1) model and the 
other random coefficient models with On UA f n) than with 


Corr (Xi, R Nie even when |p| is small, as seen in comparing Figures 


nzk 
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IV.C.1 and IV.B.1. There is still little disorimination between the 
BELAR(1) model and the given non-boundary NLAR(1) model. However, the 
important point is that since the LAR(1) model is a linear AR(1) model, 


Cor (RR A = 0 for all values of p and for all k = £l, #2,.... 
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V. EXTENSIONS AND OPEN PROBLEMS 


A. INTRODUCTION 

During the discussion in the previous chapters, possible extensions 
and/or unresolved issues have been mentioned. At this point, we con- 
clude by summarizing some of the directions in which this research could 
be continued. There are still significant contributions to be made, 
particularly in parameter estimation, model development and 
applications. 

B. ESTIMATION 

There are several open questions and extensions in the area of 
parameter estimation and inference for this class of stochastic 
processes. 

First, there is the need to obtain theoretical results substantiat- 
ing the empirical results from the simulation of the maximum likelihood 
estimator (m.l.e.) of serial correlation in the TLAR(1) and the BELAR(1) 
processes. Several researchers have written on the subject of maximum 
likelihood estimation in dependent sequences. Much of this is assembled 
in the books by Basawa and Prakasa Rao [Ref. 42] and by Basawa and Scott 
[Ref. 43]. It is not known whether the conditions on the conditional 
densities are satisfied in the cases of these random coefficient AR(1) 
models to prove that the m.l.e. is consistent, asymptotically efficient 


or asymptotically Normal. Conditions for the existence of the m.l.e's 
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are generally extremely complicated and difficult to verify unless the 
log likelihood is absolutely continuous in the parameter space. 

A Second problem to resolve is that of existence and uniqueness of 
the maximum likelihood estimators of (a. , B.) in the NLAR(1) process. In 
this case, the log likelihood is definitely not differentiable with 


respect to the parameter, 8 nor is it clear that there is a unique 


1 , 
maximum. It appears from contour plots of the log-likelihood function 
over a grid of values in (a, , B4) coordinates that there is a unique 


local optimum within the region bounded by 0 < a, < 1 and -1 < 8. e 


1 
for large enough samples of (X 3. A non-linear optimization technique 
that uses only function values and not derivatives seems to be 
appropriate, since the log-likelihood function is not differentiable 
everywhere with respect to B. 

A third problem involves the &-Beta-Laplace AR(1) model. Except for 
the case when %£ is assumed to be one (the BELAR(1) model), the 
likelihood function in (a,%) has not been derived. This is primarily a 
numerical issue since neither the density of a for non-integer values 


of £ nor the conditional density of X given A, for any values of 


SCH 
£ > O have closed-form expressions. 

A fOurrhn issue in estimation"is tovextend the maximum likelihood 
approach to include the joint estimation of the scale parameter of the 
marginal distribution to that of the shape parameter and the serial 


correlation coefficient. There is no reason to assume that the marginal 


distribution should always be a standard Laplace or standard %-Laplace. 
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Finally, there is the issue of quantile estimation in the random 
coefficient models. Empirical results are given only for the BELAR(1) 
process for the distribution of the sample median. Theoretical results 
are related to mixing conditions. Based on a new mixing condition, 
which has been shown to be satisfied by linear AR(1) processes 
(Ref. 44], Gastwirth and Rubin derived the asymptotic Normal theory of 
quantile estimation for the linear LAR(1) process. The open question is 
whether the mixing condition of Gastwirth and Rubin is satisfied by any 
of the random coefficient models--NLAR( 1), BELAR(1) or £-Beta-Laplaoce 
AR(1). 

C. MODEL DEVELOPMENT 

Advances in modelling can be made in developing scalar models with 
p-th order autocorrelation structure, as well as bivariate autoregres- 
sive models. 

An open question in the development of the NLARMA(p,q) family of 
models is the existence of the general class of models with p-th order 
autocorrelation structure--NLAR(p) for p 2 3: specifically, it sio 
derive the distribution of the i.i.d. innovation sequence te}. This is 
only known for the TLAR(p) subclass of a proposed NLAR(p) family. 

A Similar question is open for p 2 2 in the continuous random 
coefficient models with an %-Laplace marginal distribution. The actual 
structure of the model, as well as the distribution of the innovation is 
in question. 

There is also a need for multivariate time series in many fields of 


physical science. The NEAR(2) framework was used by Dewald and Lewis 
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[Ref. 24] to derive a bivariate exponential AR(1) model. Such an 
extension is also possible with the NLAR(2) model. Just how one 
estimates the eight possible parameters in such a model is an open 
question. 

Related to the model development and parameter estimation is the 
need to identify particular models. Higher order residual analyses have 


been based on the linear residual i = X - a.X Since the 


n s s 
NLAR(2) model is only partially time reversible, it is possible that the 
meverosed residual d = X, = aX et S aX i+] could be used in model 
identification as well. These were introduced by Lawrance and Lewis 
[Ref. 6, 45] but their use has not been explored in any context. 

There is also the question of the effect that estimating a, and a, 
from (X ) Will have on the sample autocorrelation of (R^,R*, /) and tne 
cross-correlation of (X^,R, |) in the fourth-order residual analyses 
proposed in Chapter IV. 

D. APPLICATIONS 

In Chapter I, several areas have been noted where the modelling is 
accomplished with heavy-tailed distributions, notably in voice and 
acoustics modelling, as well as in image coding. In these areas, the 
Laplace distribution and the symmetric Gamma distributions are widely 
used. There is the possibility that the &-Laplace for 2% < 1 could also 
be a useful alternative to the symmetric Gamma. One advantage of tne 


£-Laplace distribution, which is the difference of two i.i.d. Gamma(£,2) 


is the simplicity of the form of the characteristic function. 
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Another field in which the &-Laplace models could be useful is in 
the modelling of the directional components of wind speed. Models with 
skewed marginal distributions have been fitted to data and then 
transformed either to Normals (for example by Brown, Katz and Murphy 
[Ref . 46]), or to Exponentials by Lawrance and Lewis [Ref. 6]. In both 
of the cited papers, the data indicated that the wind was almost always 
blowing. The question is, however, how does one model wind velocity 
when there are long calm periods. This is a problem from Australia as 
related by T. Lewis in the discussion of the NEAR(2) model [Ref. 6]. As 
can be seen in Figure III.C.1, for small values of 2, highly correlated 
periods of calm and wind can be generated using the 2-Beta-Laplace AR(1) 
model. 

The preceding examples demonstrate the opportunities for continued 


research and are not intended to narrow the focus of future endeavors. 
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VI. SUMMARY AND CONCLUSIONS 


We have indicated by reference to the soientific literature that there 
are important application areas, especially in the physical sciences of time 
Series whose marginal distributions are non-Normal. This feature, itself, 
presents new problems in the modelling, study and analysis. 

For those areas where the non-Normality manifests itself primarily in 
the thickness (heaviness) of the tails of the marginal distributions, we 
have demonstrated that within the £-Laplace family of distributions, there 
is an appropriate member with which to model phenomenon with a symmetric 
heavy-tailed marginal distribution. The £-Laplace family has very thick 
tails when £ is small and a limiting Normal distribution as 2 increases. 

To account for serial dependence in the time series we have derived two 
families of random processes that extends the random coefficient approach to 
modelling non-Normal time series. The discrete random coefficient models 
(NLARMA(p,q)) have a Laplace marginal distribution and the continuous random 
coefficient models (£-Beta-Laplace AR(1) and MA(q)) have an &-Laplace 
marginal distribution. Both families are additive models and imitate the 
linear Gaussian models in that they exhibit the usual ARMA(p,q) correlation 
structure. The models are parametrically parsimonious, structurally simple 
and easy to generate on a computer. 

We have also demonstrated that the fourth-order residual analyses based 
on the uncorrelated, but dependent sequence (RJ are appropriate and useful 


methods to discriminate between the discrete random coefficient and the 
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continuous random coefficient models when first, second and third-or der 
properties oe identical. 

For the purposes of parameter estimation, we derived the joint 
probability density function. Numerical routines were written to maximize 
the likelihood function to estimate the serial correlation coefficient in 
the BELAR(1) and the TLAR(1) processes. Simulation results indicated that 
this estimator was more efficient and less biased than the least squares 
estimator derived from the linear residual. 

Finally, we summarized some of the remaining issues in this field of 
non-Normal time series analysis. Extensions of the analyses in this thesis 
which need to be pursued are noted, along with possible applications in 


those previously mentioned fields of the physical sciences. 
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